Overview

Dataset statistics

Number of variables: 42
Number of observations: 199523
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 2766
Duplicate rows (%): 1.4%
Total size in memory: 63.9 MiB
Average record size in memory: 336.0 B
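
The derived figures in this table can be cross-checked from the raw counts. The snippet below is a minimal sketch that only uses numbers copied from the table; the variable names are illustrative, and the average record size is approximate because the total size is reported rounded to 0.1 MiB.

```python
# Figures copied from the "Dataset statistics" table.
n_rows = 199_523
n_duplicates = 2_766
total_mib = 63.9  # reported total size in memory, already rounded to 0.1 MiB

# Share of duplicate rows, in percent.
duplicate_pct = 100 * n_duplicates / n_rows

# Average bytes per record (1 MiB = 1024**2 bytes); approximate because
# total_mib is a rounded value.
avg_record_bytes = total_mib * 1024**2 / n_rows

print(f"Duplicate rows: {duplicate_pct:.1f}%")
print(f"Average record size: about {avg_record_bytes:.0f} B")
```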

Variable types

Numeric: 10
Categorical: 32

Warnings

Dataset has 2766 (1.4%) duplicate rows [Duplicates]
state_of_previous_residence has a high cardinality: 51 distinct values [High cardinality]
num_persons_worked_for_employer is highly correlated with weeks_worked_in_year [High correlation]
weeks_worked_in_year is highly correlated with num_persons_worked_for_employer [High correlation]
num_persons_worked_for_employer is highly correlated with weeks_worked_in_year [High correlation]
weeks_worked_in_year is highly correlated with num_persons_worked_for_employer [High correlation]
age is highly correlated with wage_per_hour and 4 other fields [High correlation]
wage_per_hour is highly correlated with age and 4 other fields [High correlation]
capital_gains is highly correlated with age and 4 other fields [High correlation]
capital_losses is highly correlated with age and 4 other fields [High correlation]
divdends_from_stocks is highly correlated with age and 4 other fields [High correlation]
instance_weight is highly correlated with label [High correlation]
num_persons_worked_for_employer is highly correlated with label [High correlation]
weeks_worked_in_year is highly correlated with label [High correlation]
label is highly correlated with age and 7 other fields [High correlation]
race is highly correlated with country_of_birth_mother and 2 other fields [High correlation]
tax_filer_status is highly correlated with weeks_worked_in_year and 12 other fields [High correlation]
migration_code_change_in_msa is highly correlated with migration_code_move_within_reg and 7 other fields [High correlation]
migration_code_move_within_reg is highly correlated with migration_code_change_in_msa and 7 other fields [High correlation]
country_of_birth_mother is highly correlated with race and 4 other fields [High correlation]
weeks_worked_in_year is highly correlated with tax_filer_status and 10 other fields [High correlation]
member_of_a_labor_union is highly correlated with class_of_worker and 1 other field [High correlation]
citizenship is highly correlated with country_of_birth_mother and 3 other fields [High correlation]
hispanic_Origin is highly correlated with country_of_birth_mother and 3 other fields [High correlation]
full_or_part_time_employment_stat is highly correlated with migration_code_change_in_msa and 6 other fields [High correlation]
region_of_previous_residence is highly correlated with migration_code_change_in_msa and 5 other fields [High correlation]
migration_code_change_in_reg is highly correlated with migration_code_change_in_msa and 7 other fields [High correlation]
detailed_household_summary_in_household is highly correlated with tax_filer_status and 7 other fields [High correlation]
live_in_this_house_1_year_ago is highly correlated with migration_code_change_in_msa and 7 other fields [High correlation]
year is highly correlated with migration_code_change_in_msa and 5 other fields [High correlation]
veterans_benefits is highly correlated with tax_filer_status and 13 other fields [High correlation]
education is highly correlated with tax_filer_status and 12 other fields [High correlation]
major_occupation_code is highly correlated with tax_filer_status and 11 other fields [High correlation]
state_of_previous_residence is highly correlated with migration_code_change_in_msa and 5 other fields [High correlation]
class_of_worker is highly correlated with tax_filer_status and 12 other fields [High correlation]
sex is highly correlated with detailed_household_and_family_stat [High correlation]
wage_per_hour is highly correlated with member_of_a_labor_union [High correlation]
num_persons_worked_for_employer is highly correlated with tax_filer_status and 10 other fields [High correlation]
migration_prev_res_in_sunbelt is highly correlated with migration_code_change_in_msa and 7 other fields [High correlation]
age is highly correlated with tax_filer_status and 14 other fields [High correlation]
country_of_birth_self is highly correlated with race and 4 other fields [High correlation]
major_industry_code is highly correlated with tax_filer_status and 12 other fields [High correlation]
industry_code is highly correlated with tax_filer_status and 9 other fields [High correlation]
detailed_household_and_family_stat is highly correlated with tax_filer_status and 12 other fields [High correlation]
family_members_under_18 is highly correlated with tax_filer_status and 11 other fields [High correlation]
marital_status is highly correlated with tax_filer_status and 6 other fields [High correlation]
reason_for_unemployment is highly correlated with class_of_worker [High correlation]
country_of_birth_father is highly correlated with race and 4 other fields [High correlation]
fill_inc_questionnaire_for_veterans_admin is highly correlated with veterans_benefits [High correlation]
occupation_code is highly correlated with weeks_worked_in_year and 8 other fields [High correlation]
enrolled_in_edu_inst_last_wk is highly correlated with education and 2 other fields [High correlation]
tax_filer_status is highly correlated with veterans_benefits and 1 other field [High correlation]
migration_code_change_in_msa is highly correlated with migration_code_move_within_reg and 5 other fields [High correlation]
migration_code_move_within_reg is highly correlated with migration_code_change_in_msa and 6 other fields [High correlation]
country_of_birth_mother is highly correlated with citizenship and 3 other fields [High correlation]
citizenship is highly correlated with country_of_birth_mother and 2 other fields [High correlation]
hispanic_Origin is highly correlated with country_of_birth_mother and 1 other field [High correlation]
full_or_part_time_employment_stat is highly correlated with live_in_this_house_1_year_ago and 1 other field [High correlation]
migration_code_change_in_reg is highly correlated with migration_code_change_in_msa and 5 other fields [High correlation]
region_of_previous_residence is highly correlated with migration_code_change_in_msa and 5 other fields [High correlation]
detailed_household_summary_in_household is highly correlated with veterans_benefits and 2 other fields [High correlation]
live_in_this_house_1_year_ago is highly correlated with migration_code_change_in_msa and 7 other fields [High correlation]
year is highly correlated with migration_code_change_in_msa and 5 other fields [High correlation]
veterans_benefits is highly correlated with tax_filer_status and 5 other fields [High correlation]
education is highly correlated with veterans_benefits [High correlation]
major_occupation_code is highly correlated with major_industry_code [High correlation]
state_of_previous_residence is highly correlated with migration_code_move_within_reg and 3 other fields [High correlation]
migration_prev_res_in_sunbelt is highly correlated with migration_code_change_in_msa and 6 other fields [High correlation]
country_of_birth_self is highly correlated with country_of_birth_mother and 2 other fields [High correlation]
major_industry_code is highly correlated with major_occupation_code [High correlation]
detailed_household_and_family_stat is highly correlated with tax_filer_status and 3 other fields [High correlation]
family_members_under_18 is highly correlated with detailed_household_summary_in_household and 2 other fields [High correlation]
country_of_birth_father is highly correlated with country_of_birth_mother and 3 other fields [High correlation]
fill_inc_questionnaire_for_veterans_admin is highly correlated with veterans_benefits [High correlation]
divdends_from_stocks is highly skewed (γ1 = 27.78650179) [Skewed]
age has 2839 (1.4%) zeros [Zeros]
industry_code has 100684 (50.5%) zeros [Zeros]
occupation_code has 100684 (50.5%) zeros [Zeros]
wage_per_hour has 188219 (94.3%) zeros [Zeros]
capital_gains has 192144 (96.3%) zeros [Zeros]
capital_losses has 195617 (98.0%) zeros [Zeros]
divdends_from_stocks has 178382 (89.4%) zeros [Zeros]
num_persons_worked_for_employer has 95983 (48.1%) zeros [Zeros]
weeks_worked_in_year has 95983 (48.1%) zeros [Zeros]
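
The "Skewed" warning reports a skewness coefficient γ1. As a rough illustration of how such a value can be computed, here is a minimal sketch of the Fisher-Pearson skewness with an optional sample-size adjustment. It is an assumption that the report uses the adjusted form (the form pandas' `Series.skew` returns); the function name and the toy sample are illustrative and are not taken from the dataset.

```python
import math

def skewness(values, adjusted=True):
    """Fisher-Pearson skewness of a sequence of numbers.

    adjusted=False gives g1 = m3 / m2**1.5 (population moments).
    adjusted=True applies the usual sample-size correction
    G1 = g1 * sqrt(n * (n - 1)) / (n - 2), which needs n >= 3.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    g1 = m3 / m2 ** 1.5
    if not adjusted:
        return g1
    return g1 * math.sqrt(n * (n - 1)) / (n - 2)

# Toy zero-heavy sample (not from the dataset): mostly zeros, one large value.
sample = [0, 0, 0, 0, 10]
print(skewness(sample, adjusted=False))
print(skewness(sample))
```

A column dominated by zeros with a few large values, as several columns in this report are, produces a large positive coefficient under either form.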

Reproduction

Analysis started: 2021-09-08 17:05:06.466431
Analysis finished: 2021-09-08 17:23:18.238996
Duration: 18 minutes and 11.77 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
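
The reported duration follows from the two timestamps. A minimal check using only the standard library (the format string is an assumption matching the timestamps as printed here):

```python
from datetime import datetime

# Timestamp layout as printed in the "Reproduction" table.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2021-09-08 17:05:06.466431", FMT)
finished = datetime.strptime("2021-09-08 17:23:18.238996", FMT)

# Elapsed wall-clock time, split into whole minutes and remaining seconds.
elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minutes and {seconds:.2f} seconds")
```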

Variables

age
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 91
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 34.49419866
Minimum: 0
Maximum: 90
Zeros: 2839
Zeros (%): 1.4%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 3
Q1: 15
median: 33
Q3: 50
95-th percentile: 75
Maximum: 90
Range: 90
Interquartile range (IQR): 35

Descriptive statistics

Standard deviation: 22.31089521
Coefficient of variation (CV): 0.6468013774
Kurtosis: -0.7328243009
Mean: 34.49419866
Median Absolute Deviation (MAD): 17
Skewness: 0.3732904573
Sum: 6882386
Variance: 497.7760449
Monotonicity: Not monotonic
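
Several of the quantile and descriptive statistics for age are simple functions of one another. The sketch below recomputes them from the reported values; all inputs are copied from the tables for age, the variable names are illustrative, and small differences from the reported figures come from the inputs being rounded.

```python
# Values copied from the tables for the age variable.
n = 199_523            # number of observations
mean = 34.49419866
std = 22.31089521
q1, q3 = 15, 50
minimum, maximum = 0, 90

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
total = mean * n                 # sum of all values
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV={cv:.10f} variance={variance:.7f} sum={total:.0f} "
      f"IQR={iqr} range={value_range}")
```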
Histogram with fixed size bins (bins=50)
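
Common Values

Value | Count | Frequency (%)
34 | 3489 | 1.7%
35 | 3450 | 1.7%
36 | 3353 | 1.7%
31 | 3351 | 1.7%
33 | 3340 | 1.7%
5 | 3332 | 1.7%
4 | 3318 | 1.7%
3 | 3279 | 1.6%
37 | 3278 | 1.6%
38 | 3277 | 1.6%
Other values (81) | 166056 | 83.2%

Minimum 10 values

Value | Count | Frequency (%)
0 | 2839 | 1.4%
1 | 3138 | 1.6%
2 | 3236 | 1.6%
3 | 3279 | 1.6%
4 | 3318 | 1.7%
5 | 3332 | 1.7%
6 | 3171 | 1.6%
7 | 3218 | 1.6%
8 | 3187 | 1.6%
9 | 3162 | 1.6%

Maximum 10 values

Value | Count | Frequency (%)
90 | 725 | 0.4%
89 | 195 | 0.1%
88 | 241 | 0.1%
87 | 301 | 0.2%
86 | 348 | 0.2%
85 | 423 | 0.2%
84 | 519 | 0.3%
83 | 561 | 0.3%
82 | 615 | 0.3%
81 | 720 | 0.4%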

class_of_worker
Categorical

HIGH CORRELATION

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value | Count
Not in universe | 100245
Private | 72028
Self-employed-not incorporated | 8445
Local government | 7784
State government | 4227
Other values (4) | 6794

Length

Max length: 31
Median length: 16
Mean length: 14.02115546
Min length: 8

Characters and Unicode

Total characters: 2797543
Distinct characters: 29
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
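
As an illustration of these character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The two sample strings are category values from class_of_worker; the function name is illustrative, and this is not necessarily how the report itself computes the figures.

```python
import unicodedata
from collections import Counter

def category_counts(values):
    """Count Unicode general categories (e.g. 'Lu', 'Ll') over all characters."""
    counts = Counter()
    for value in values:
        counts.update(unicodedata.category(ch) for ch in value)
    return counts

# Two of the class_of_worker category values.
sample = ["Not in universe", "Self-employed-not incorporated"]
counts = category_counts(sample)

# Lu = Uppercase Letter, Ll = Lowercase Letter,
# Zs = Space Separator, Pd = Dash Punctuation
for code, count in sorted(counts.items()):
    print(code, count)
```

The four category codes found here correspond to the four distinct categories listed for class_of_worker.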

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row Not in universe
2nd row Self-employed-not incorporated
3rd row Not in universe
4th row Not in universe
5th row Not in universe

Common Values

Value | Count | Frequency (%)
Not in universe | 100245 | 50.2%
Private | 72028 | 36.1%
Self-employed-not incorporated | 8445 | 4.2%
Local government | 7784 | 3.9%
State government | 4227 | 2.1%
Self-employed-incorporated | 3265 | 1.6%
Federal government | 2925 | 1.5%
Never worked | 439 | 0.2%
Without pay | 165 | 0.1%

Length

Histogram of lengths of the category

Pie chart

Words

Value | Count | Frequency (%)
not | 100245 | 23.6%
in | 100245 | 23.6%
universe | 100245 | 23.6%
private | 72028 | 17.0%
government | 14936 | 3.5%
self-employed-not | 8445 | 2.0%
incorporated | 8445 | 2.0%
local | 7784 | 1.8%
state | 4227 | 1.0%
self-employed-incorporated | 3265 | 0.8%
Other values (5) | 4133 | 1.0%
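
The word frequencies are consistent with lowercasing each category value, splitting it on whitespace, and weighting every word by the category's count. That tokenization is an assumption inferred from the table, not something the report states. A minimal sketch using the class_of_worker counts from the Common Values table:

```python
from collections import Counter

# Category counts copied from the Common Values table for class_of_worker.
value_counts = {
    "Not in universe": 100245,
    "Private": 72028,
    "Self-employed-not incorporated": 8445,
    "Local government": 7784,
    "State government": 4227,
    "Self-employed-incorporated": 3265,
    "Federal government": 2925,
    "Never worked": 439,
    "Without pay": 165,
}

# Assumed tokenization: lowercase, split on whitespace, weight by row count.
word_counts = Counter()
for value, count in value_counts.items():
    for word in value.lower().split():
        word_counts[word] += count

total_words = sum(word_counts.values())
for word, count in word_counts.most_common(5):
    print(f"{word} | {count} | {100 * count / total_words:.1f}%")
```

For example, "government" is the sum of the Local, State, and Federal government rows.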

Most occurring characters

Value | Count | Frequency (%)
(space) | 423998 | 15.2%
e | 360624 | 12.9%
i | 284393 | 10.2%
n | 250517 | 9.0%
t | 216148 | 7.7%
r | 214432 | 7.7%
v | 187648 | 6.7%
o | 167144 | 6.0%
N | 100684 | 3.6%
u | 100410 | 3.6%
Other values (19) | 491545 | 17.6%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 2150602 | 76.9%
Space Separator | 423998 | 15.2%
Uppercase Letter | 199523 | 7.1%
Dash Punctuation | 23420 | 0.8%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 360624 | 16.8%
i | 284393 | 13.2%
n | 250517 | 11.6%
t | 216148 | 10.1%
r | 214432 | 10.0%
v | 187648 | 8.7%
o | 167144 | 7.8%
u | 100410 | 4.7%
s | 100245 | 4.7%
a | 98839 | 4.6%
Other values (11) | 170202 | 7.9%

Uppercase Letter
Value | Count | Frequency (%)
N | 100684 | 50.5%
P | 72028 | 36.1%
S | 15937 | 8.0%
L | 7784 | 3.9%
F | 2925 | 1.5%
W | 165 | 0.1%

Space Separator
Value | Count | Frequency (%)
(space) | 423998 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 23420 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2350125 | 84.0%
Common | 447418 | 16.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 360624 | 15.3%
i | 284393 | 12.1%
n | 250517 | 10.7%
t | 216148 | 9.2%
r | 214432 | 9.1%
v | 187648 | 8.0%
o | 167144 | 7.1%
N | 100684 | 4.3%
u | 100410 | 4.3%
s | 100245 | 4.3%
Other values (17) | 367880 | 15.7%

Common
Value | Count | Frequency (%)
(space) | 423998 | 94.8%
- | 23420 | 5.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2797543 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 423998 | 15.2%
e | 360624 | 12.9%
i | 284393 | 10.2%
n | 250517 | 9.0%
t | 216148 | 7.7%
r | 214432 | 7.7%
v | 187648 | 6.7%
o | 167144 | 6.0%
N | 100684 | 3.6%
u | 100410 | 3.6%
Other values (19) | 491545 | 17.6%

industry_code
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 52
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.35232028
Minimum: 0
Maximum: 51
Zeros: 100684
Zeros (%): 50.5%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 33
95-th percentile: 44
Maximum: 51
Range: 51
Interquartile range (IQR): 33

Descriptive statistics

Standard deviation: 18.0671288
Coefficient of variation (CV): 1.17683376
Kurtosis: -1.501107921
Mean: 15.35232028
Median Absolute Deviation (MAD): 0
Skewness: 0.5166876791
Sum: 3063141
Variance: 326.421143
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
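
Common Values

Value | Count | Frequency (%)
0 | 100684 | 50.5%
33 | 17070 | 8.6%
43 | 8283 | 4.2%
4 | 5984 | 3.0%
42 | 4683 | 2.3%
45 | 4482 | 2.2%
29 | 4209 | 2.1%
37 | 4022 | 2.0%
41 | 3964 | 2.0%
32 | 3596 | 1.8%
Other values (42) | 42546 | 21.3%

Minimum 10 values

Value | Count | Frequency (%)
0 | 100684 | 50.5%
1 | 827 | 0.4%
2 | 2196 | 1.1%
3 | 563 | 0.3%
4 | 5984 | 3.0%
5 | 553 | 0.3%
6 | 554 | 0.3%
7 | 422 | 0.2%
8 | 550 | 0.3%
9 | 993 | 0.5%

Maximum 10 values

Value | Count | Frequency (%)
51 | 36 | < 0.1%
50 | 1704 | 0.9%
49 | 610 | 0.3%
48 | 652 | 0.3%
47 | 1644 | 0.8%
46 | 187 | 0.1%
45 | 4482 | 2.2%
44 | 2549 | 1.3%
43 | 8283 | 4.2%
42 | 4683 | 2.3%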

occupation_code
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct: 47
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 11.30655614
Minimum: 0
Maximum: 46
Zeros: 100684
Zeros (%): 50.5%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 26
95-th percentile: 38
Maximum: 46
Range: 46
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 14.45420392
Coefficient of variation (CV): 1.278391381
Kurtosis: -0.8965333655
Mean: 11.30655614
Median Absolute Deviation (MAD): 0
Skewness: 0.829238138
Sum: 2255918
Variance: 208.9240109
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=47)

Common Values

Value | Count | Frequency (%)
0 | 100684 | 50.5%
2 | 8756 | 4.4%
26 | 7887 | 4.0%
19 | 5413 | 2.7%
29 | 5105 | 2.6%
36 | 4145 | 2.1%
34 | 4025 | 2.0%
10 | 3683 | 1.8%
16 | 3445 | 1.7%
23 | 3392 | 1.7%
Other values (37) | 52988 | 26.6%

Minimum 10 values

Value | Count | Frequency (%)
0 | 100684 | 50.5%
1 | 544 | 0.3%
2 | 8756 | 4.4%
3 | 3195 | 1.6%
4 | 1364 | 0.7%
5 | 855 | 0.4%
6 | 441 | 0.2%
7 | 731 | 0.4%
8 | 2151 | 1.1%
9 | 738 | 0.4%

Maximum 10 values

Value | Count | Frequency (%)
46 | 36 | < 0.1%
45 | 172 | 0.1%
44 | 1592 | 0.8%
43 | 1382 | 0.7%
42 | 1918 | 1.0%
41 | 1592 | 0.8%
40 | 617 | 0.3%
39 | 1017 | 0.5%
38 | 3003 | 1.5%
37 | 2234 | 1.1%

education
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value | Count
High school graduate | 48407
Children | 47422
Some college but no degree | 27820
Bachelors degree(BA AB BS) | 19865
7th and 8th grade | 8007
Other values (12) | 48002

Length

Max length: 39
Median length: 21
Mean length: 19.86398561
Min length: 9

Characters and Unicode

Total characters: 3963322
Distinct characters: 47
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row High school graduate
2nd row Some college but no degree
3rd row 10th grade
4th row Children
5th row Children

Common Values

Value | Count | Frequency (%)
High school graduate | 48407 | 24.3%
Children | 47422 | 23.8%
Some college but no degree | 27820 | 13.9%
Bachelors degree(BA AB BS) | 19865 | 10.0%
7th and 8th grade | 8007 | 4.0%
10th grade | 7557 | 3.8%
11th grade | 6876 | 3.4%
Masters degree(MA MS MEng MEd MSW MBA) | 6541 | 3.3%
9th grade | 6230 | 3.1%
Associates degree-occup /vocational | 5358 | 2.7%
Other values (7) | 15440 | 7.7%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
school | 50200 | 8.2%
graduate | 48407 | 7.9%
high | 48407 | 7.9%
children | 47422 | 7.7%
grade | 36691 | 6.0%
no | 29946 | 4.9%
degree | 29613 | 4.8%
some | 27820 | 4.5%
college | 27820 | 4.5%
but | 27820 | 4.5%
Other values (42) | 239176 | 39.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 613322 | 15.5%
e | 459561 | 11.6%
o | 247530 | 6.2%
r | 244586 | 6.2%
g | 239232 | 6.0%
d | 225421 | 5.7%
h | 215132 | 5.4%
a | 205652 | 5.2%
l | 180611 | 4.6%
t | 150966 | 3.8%
Other values (37) | 1181309 | 29.8%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 2803290 | 70.7%
Space Separator | 613322 | 15.5%
Uppercase Letter | 402776 | 10.2%
Decimal Number | 69931 | 1.8%
Open Punctuation | 29462 | 0.7%
Close Punctuation | 29462 | 0.7%
Dash Punctuation | 9721 | 0.2%
Other Punctuation | 5358 | 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 459561 | 16.4%
o | 247530 | 8.8%
r | 244586 | 8.7%
g | 239232 | 8.5%
d | 225421 | 8.0%
h | 215132 | 7.7%
a | 205652 | 7.3%
l | 180611 | 6.4%
t | 150966 | 5.4%
c | 133669 | 4.8%
Other values (9) | 500930 | 17.9%

Uppercase Letter
Value | Count | Frequency (%)
B | 87794 | 21.8%
S | 62560 | 15.5%
A | 62533 | 15.5%
M | 49373 | 12.3%
H | 48407 | 12.0%
C | 47422 | 11.8%
E | 14345 | 3.6%
D | 12754 | 3.2%
W | 6541 | 1.6%
L | 4405 | 1.1%
Other values (3) | 6642 | 1.6%

Decimal Number
Value | Count | Frequency (%)
1 | 26053 | 37.3%
7 | 8007 | 11.4%
8 | 8007 | 11.4%
0 | 7557 | 10.8%
9 | 6230 | 8.9%
2 | 3925 | 5.6%
5 | 3277 | 4.7%
6 | 3277 | 4.7%
3 | 1799 | 2.6%
4 | 1799 | 2.6%

Space Separator
Value | Count | Frequency (%)
(space) | 613322 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 29462 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 29462 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 9721 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
/ | 5358 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 3206066 | 80.9%
Common | 757256 | 19.1%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 459561 | 14.3%
o | 247530 | 7.7%
r | 244586 | 7.6%
g | 239232 | 7.5%
d | 225421 | 7.0%
h | 215132 | 6.7%
a | 205652 | 6.4%
l | 180611 | 5.6%
t | 150966 | 4.7%
c | 133669 | 4.2%
Other values (22) | 903706 | 28.2%

Common
Value | Count | Frequency (%)
(space) | 613322 | 81.0%
( | 29462 | 3.9%
) | 29462 | 3.9%
1 | 26053 | 3.4%
- | 9721 | 1.3%
7 | 8007 | 1.1%
8 | 8007 | 1.1%
0 | 7557 | 1.0%
9 | 6230 | 0.8%
/ | 5358 | 0.7%
Other values (5) | 14077 | 1.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3963322 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 613322 | 15.5%
e | 459561 | 11.6%
o | 247530 | 6.2%
r | 244586 | 6.2%
g | 239232 | 6.0%
d | 225421 | 5.7%
h | 215132 | 5.4%
a | 205652 | 5.2%
l | 180611 | 4.6%
t | 150966 | 3.8%
Other values (37) | 1181309 | 29.8%

wage_per_hour
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 1240
Distinct (%): 0.6%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 55.42690818
Minimum: 0
Maximum: 9999
Zeros: 188219
Zeros (%): 94.3%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 495
Maximum: 9999
Range: 9999
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 274.8964539
Coefficient of variation (CV): 4.959620931
Kurtosis: 155.2188969
Mean: 55.42690818
Median Absolute Deviation (MAD): 0
Skewness: 8.935096531
Sum: 11058943
Variance: 75568.06037
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common Values

Value | Count | Frequency (%)
0 | 188219 | 94.3%
500 | 734 | 0.4%
600 | 546 | 0.3%
700 | 534 | 0.3%
800 | 507 | 0.3%
1000 | 386 | 0.2%
425 | 376 | 0.2%
900 | 336 | 0.2%
550 | 280 | 0.1%
1200 | 256 | 0.1%
Other values (1230) | 7349 | 3.7%

Minimum 10 values

Value | Count | Frequency (%)
0 | 188219 | 94.3%
20 | 1 | < 0.1%
70 | 1 | < 0.1%
75 | 2 | < 0.1%
100 | 11 | < 0.1%
110 | 1 | < 0.1%
125 | 1 | < 0.1%
135 | 1 | < 0.1%
143 | 1 | < 0.1%
150 | 6 | < 0.1%

Maximum 10 values

Value | Count | Frequency (%)
9999 | 1 | < 0.1%
9916 | 1 | < 0.1%
9800 | 2 | < 0.1%
9400 | 2 | < 0.1%
9000 | 1 | < 0.1%
8800 | 1 | < 0.1%
8600 | 1 | < 0.1%
8500 | 1 | < 0.1%
8300 | 1 | < 0.1%
8000 | 4 | < 0.1%

enrolled_in_edu_inst_last_wk
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value | Count
Not in universe | 186943
High school | 6892
College or university | 5688

Length

Max length: 22
Median length: 16
Mean length: 16.03287842
Min length: 12

Characters and Unicode

Total characters: 3198928
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row Not in universe
2nd row Not in universe
3rd row High school
4th row Not in universe
5th row Not in universe

Common Values

Value | Count | Frequency (%)
Not in universe | 186943 | 93.7%
High school | 6892 | 3.5%
College or university | 5688 | 2.9%

Length

Histogram of lengths of the category

Pie chart

Words

Value | Count | Frequency (%)
not | 186943 | 31.6%
in | 186943 | 31.6%
universe | 186943 | 31.6%
high | 6892 | 1.2%
school | 6892 | 1.2%
college | 5688 | 1.0%
or | 5688 | 1.0%
university | 5688 | 1.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 591677 | 18.5%
i | 392154 | 12.3%
e | 390950 | 12.2%
n | 379574 | 11.9%
o | 212103 | 6.6%
s | 199523 | 6.2%
r | 198319 | 6.2%
t | 192631 | 6.0%
u | 192631 | 6.0%
v | 192631 | 6.0%
Other values (8) | 256735 | 8.0%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 2407728 | 75.3%
Space Separator | 591677 | 18.5%
Uppercase Letter | 199523 | 6.2%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
i | 392154 | 16.3%
e | 390950 | 16.2%
n | 379574 | 15.8%
o | 212103 | 8.8%
s | 199523 | 8.3%
r | 198319 | 8.2%
t | 192631 | 8.0%
u | 192631 | 8.0%
v | 192631 | 8.0%
l | 18268 | 0.8%
Other values (4) | 38944 | 1.6%

Uppercase Letter
Value | Count | Frequency (%)
N | 186943 | 93.7%
H | 6892 | 3.5%
C | 5688 | 2.9%

Space Separator
Value | Count | Frequency (%)
(space) | 591677 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2607251 | 81.5%
Common | 591677 | 18.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
i | 392154 | 15.0%
e | 390950 | 15.0%
n | 379574 | 14.6%
o | 212103 | 8.1%
s | 199523 | 7.7%
r | 198319 | 7.6%
t | 192631 | 7.4%
u | 192631 | 7.4%
v | 192631 | 7.4%
N | 186943 | 7.2%
Other values (7) | 69792 | 2.7%

Common
Value | Count | Frequency (%)
(space) | 591677 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3198928 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 591677 | 18.5%
i | 392154 | 12.3%
e | 390950 | 12.2%
n | 379574 | 11.9%
o | 212103 | 6.6%
s | 199523 | 6.2%
r | 198319 | 6.2%
t | 192631 | 6.0%
u | 192631 | 6.0%
v | 192631 | 6.0%
Other values (8) | 256735 | 8.0%

marital_status
Categorical

HIGH CORRELATION

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value | Count
Never married | 86485
Married-civilian spouse present | 84222
Divorced | 12710
Widowed | 10463
Separated | 3460
Other values (2) | 2183

Length

Max length: 32
Median length: 14
Mean length: 20.99977947
Min length: 8

Characters and Unicode

Total characters: 4189939
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row Widowed
2nd row Divorced
3rd row Never married
4th row Never married
5th row Never married

Common Values

Value | Count | Frequency (%)
Never married | 86485 | 43.3%
Married-civilian spouse present | 84222 | 42.2%
Divorced | 12710 | 6.4%
Widowed | 10463 | 5.2%
Separated | 3460 | 1.7%
Married-spouse absent | 1518 | 0.8%
Married-A F spouse present | 665 | 0.3%

Length

Histogram of lengths of the category

Pie chart

Words

Value | Count | Frequency (%)
never | 86485 | 18.9%
married | 86485 | 18.9%
spouse | 84887 | 18.5%
present | 84887 | 18.5%
married-civilian | 84222 | 18.4%
divorced | 12710 | 2.8%
widowed | 10463 | 2.3%
separated | 3460 | 0.8%
married-spouse | 1518 | 0.3%
absent | 1518 | 0.3%
Other values (2) | 1330 | 0.3%

Most occurring characters

Value | Count | Frequency (%)
e | 633650 | 15.1%
r | 533322 | 12.7%
(space) | 457965 | 10.9%
i | 448729 | 10.7%
a | 265550 | 6.3%
s | 259215 | 6.2%
d | 209986 | 5.0%
v | 183417 | 4.4%
p | 174752 | 4.2%
n | 170627 | 4.1%
Other values (16) | 852726 | 20.4%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 3444716 | 82.2%
Space Separator | 457965 | 10.9%
Uppercase Letter | 200853 | 4.8%
Dash Punctuation | 86405 | 2.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 633650 | 18.4%
r | 533322 | 15.5%
i | 448729 | 13.0%
a | 265550 | 7.7%
s | 259215 | 7.5%
d | 209986 | 6.1%
v | 183417 | 5.3%
p | 174752 | 5.1%
n | 170627 | 5.0%
o | 109578 | 3.2%
Other values (7) | 455890 | 13.2%

Uppercase Letter
Value | Count | Frequency (%)
N | 86485 | 43.1%
M | 86405 | 43.0%
D | 12710 | 6.3%
W | 10463 | 5.2%
S | 3460 | 1.7%
A | 665 | 0.3%
F | 665 | 0.3%

Space Separator
Value | Count | Frequency (%)
(space) | 457965 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 86405 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 3645569 | 87.0%
Common | 544370 | 13.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 633650 | 17.4%
r | 533322 | 14.6%
i | 448729 | 12.3%
a | 265550 | 7.3%
s | 259215 | 7.1%
d | 209986 | 5.8%
v | 183417 | 5.0%
p | 174752 | 4.8%
n | 170627 | 4.7%
o | 109578 | 3.0%
Other values (14) | 656743 | 18.0%

Common
Value | Count | Frequency (%)
(space) | 457965 | 84.1%
- | 86405 | 15.9%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4189939 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
e | 633650 | 15.1%
r | 533322 | 12.7%
(space) | 457965 | 10.9%
i | 448729 | 10.7%
a | 265550 | 6.3%
s | 259215 | 6.2%
d | 209986 | 5.0%
v | 183417 | 4.4%
p | 174752 | 4.2%
n | 170627 | 4.1%
Other values (16) | 852726 | 20.4%

major_industry_code
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 24
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value | Count
Not in universe or children | 100684
Retail trade | 17070
Manufacturing-durable goods | 9015
Education | 8283
Manufacturing-nondurable goods | 6897
Other values (19) | 57574

Length

Max length: 36
Median length: 28
Mean length: 24.39614982
Min length: 7

Characters and Unicode

Total characters: 4867593
Distinct characters: 38
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row Not in universe or children
2nd row Construction
3rd row Not in universe or children
4th row Not in universe or children
5th row Not in universe or children

Common Values

Value | Count | Frequency (%)
Not in universe or children | 100684 | 50.5%
Retail trade | 17070 | 8.6%
Manufacturing-durable goods | 9015 | 4.5%
Education | 8283 | 4.2%
Manufacturing-nondurable goods | 6897 | 3.5%
Finance insurance and real estate | 6145 | 3.1%
Construction | 5984 | 3.0%
Business and repair services | 5651 | 2.8%
Medical except hospital | 4683 | 2.3%
Public administration | 4610 | 2.3%
Other values (14) | 30501 | 15.3%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
not | 100684 | 13.8%
universe | 100684 | 13.8%
or | 100684 | 13.8%
children | 100684 | 13.8%
in | 100684 | 13.8%
services | 21706 | 3.0%
trade | 20666 | 2.8%
retail | 17070 | 2.3%
goods | 15912 | 2.2%
and | 13161 | 1.8%
Other values (34) | 135470 | 18.6%

Most occurring characters

Value | Count | Frequency (%)
(space) | 727405 | 14.9%
e | 493118 | 10.1%
i | 454739 | 9.3%
n | 445989 | 9.2%
r | 444143 | 9.1%
o | 304536 | 6.3%
t | 242020 | 5.0%
s | 233277 | 4.8%
a | 190749 | 3.9%
c | 188561 | 3.9%
Other values (28) | 1143056 | 23.5%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 3918843 | 80.5%
Space Separator | 727405 | 14.9%
Uppercase Letter | 205433 | 4.2%
Dash Punctuation | 15912 | 0.3%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 493118 | 12.6%
i | 454739 | 11.6%
n | 445989 | 11.4%
r | 444143 | 11.3%
o | 304536 | 7.8%
t | 242020 | 6.2%
s | 233277 | 6.0%
a | 190749 | 4.9%
c | 188561 | 4.8%
u | 187265 | 4.8%
Other values (11) | 734446 | 18.7%

Uppercase Letter
Value | Count | Frequency (%)
N | 100684 | 49.0%
M | 21158 | 10.3%
R | 17070 | 8.3%
E | 9934 | 4.8%
H | 9838 | 4.8%
P | 8492 | 4.1%
C | 7165 | 3.5%
F | 6368 | 3.1%
B | 5651 | 2.8%
O | 4482 | 2.2%
Other values (5) | 14591 | 7.1%

Space Separator
Value | Count | Frequency (%)
(space) | 727405 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 15912 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 4124276 | 84.7%
Common | 743317 | 15.3%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 493118 | 12.0%
i | 454739 | 11.0%
n | 445989 | 10.8%
r | 444143 | 10.8%
o | 304536 | 7.4%
t | 242020 | 5.9%
s | 233277 | 5.7%
a | 190749 | 4.6%
c | 188561 | 4.6%
u | 187265 | 4.5%
Other values (26) | 939879 | 22.8%

Common
Value | Count | Frequency (%)
(space) | 727405 | 97.9%
- | 15912 | 2.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4867593 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 727405 | 14.9%
e | 493118 | 10.1%
i | 454739 | 9.3%
n | 445989 | 9.2%
r | 444143 | 9.1%
o | 304536 | 6.3%
t | 242020 | 5.0%
s | 233277 | 4.8%
a | 190749 | 3.9%
c | 188561 | 3.9%
Other values (28) | 1143056 | 23.5%

major_occupation_code
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 15
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value | Count
Not in universe | 100684
Adm support including clerical | 14837
Professional specialty | 13940
Executive admin and managerial | 12495
Other service | 12099
Other values (10) | 45468

Length

Max length: 38
Median length: 16
Mean length: 20.76417756
Min length: 6

Characters and Unicode

Total characters: 4142931
Distinct characters: 34
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row Not in universe
2nd row Precision production craft & repair
3rd row Not in universe
4th row Not in universe
5th row Not in universe

Common Values

Value | Count | Frequency (%)
Not in universe | 100684 | 50.5%
Adm support including clerical | 14837 | 7.4%
Professional specialty | 13940 | 7.0%
Executive admin and managerial | 12495 | 6.3%
Other service | 12099 | 6.1%
Sales | 11783 | 5.9%
Precision production craft & repair | 10518 | 5.3%
Machine operators assmblrs & inspctrs | 6379 | 3.2%
Handlers equip cleaners etc | 4127 | 2.1%
Transportation and material moving | 4020 | 2.0%
Other values (5) | 8641 | 4.3%

Length

Histogram of lengths of the category

Words

Value | Count | Frequency (%)
not | 100684 | 16.2%
universe | 100684 | 16.2%
in | 100684 | 16.2%
and | 22679 | 3.6%
support | 17855 | 2.9%
(blank) | 16897 | 2.7%
including | 14837 | 2.4%
adm | 14837 | 2.4%
clerical | 14837 | 2.4%
specialty | 13940 | 2.2%
Other values (33) | 204770 | 32.9%

Most occurring characters

Value | Count | Frequency (%)
(space) | 626831 | 15.1%
i | 414716 | 10.0%
e | 410135 | 9.9%
n | 359087 | 8.7%
r | 299839 | 7.2%
s | 260315 | 6.3%
t | 217320 | 5.2%
o | 209194 | 5.0%
a | 201628 | 4.9%
u | 161296 | 3.9%
Other values (24) | 982570 | 23.7%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 3299644 | 79.6%
Space Separator | 626831 | 15.1%
Uppercase Letter | 199559 | 4.8%
Other Punctuation | 16897 | 0.4%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
i | 414716 | 12.6%
e | 410135 | 12.4%
n | 359087 | 10.9%
r | 299839 | 9.1%
s | 260315 | 7.9%
t | 217320 | 6.6%
o | 209194 | 6.3%
a | 201628 | 6.1%
u | 161296 | 4.9%
c | 145785 | 4.4%
Other values (12) | 620329 | 18.8%

Uppercase Letter
Value | Count | Frequency (%)
N | 100684 | 50.5%
P | 26899 | 13.5%
A | 14873 | 7.5%
E | 12495 | 6.3%
O | 12099 | 6.1%
S | 11783 | 5.9%
T | 7038 | 3.5%
M | 6379 | 3.2%
H | 4127 | 2.1%
F | 3182 | 1.6%

Space Separator
Value | Count | Frequency (%)
(space) | 626831 | 100.0%

Other Punctuation
Value | Count | Frequency (%)
& | 16897 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 3499203 | 84.5%
Common | 643728 | 15.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
i | 414716 | 11.9%
e | 410135 | 11.7%
n | 359087 | 10.3%
r | 299839 | 8.6%
s | 260315 | 7.4%
t | 217320 | 6.2%
o | 209194 | 6.0%
a | 201628 | 5.8%
u | 161296 | 4.6%
c | 145785 | 4.2%
Other values (22) | 819888 | 23.4%

Common
Value | Count | Frequency (%)
(space) | 626831 | 97.4%
& | 16897 | 2.6%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 4142931 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 626831 | 15.1%
i | 414716 | 10.0%
e | 410135 | 9.9%
n | 359087 | 8.7%
r | 299839 | 7.2%
s | 260315 | 6.3%
t | 217320 | 5.2%
o | 209194 | 5.0%
a | 201628 | 4.9%
u | 161296 | 3.9%
Other values (24) | 982570 | 23.7%

race
Categorical

HIGH CORRELATION

Distinct5
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.5 MiB
White
167365 
Black
20415 
Asian or Pacific Islander
 
5835
Other
 
3657
Amer Indian Aleut or Eskimo
 
2251

Length

Max length28
Median length6
Mean length6.833096936
Min length6

Characters and Unicode

Total characters1363360
Distinct characters24
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row White
2nd row White
3rd row Asian or Pacific Islander
4th row White
5th row White

Common Values

Value                        Count   Frequency (%)
White                        167365  83.9%
Black                        20415   10.2%
Asian or Pacific Islander    5835    2.9%
Other                        3657    1.8%
Amer Indian Aleut or Eskimo  2251    1.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
white167365
74.0%
black20415
 
9.0%
or8086
 
3.6%
asian5835
 
2.6%
pacific5835
 
2.6%
islander5835
 
2.6%
other3657
 
1.6%
amer2251
 
1.0%
indian2251
 
1.0%
aleut2251
 
1.0%
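
The word frequencies above can be reproduced from the Common Values counts for `race`. The sketch below assumes that each value is lowercased and split on whitespace, which is enough for this variable because its values contain no punctuation. The function name `word_counts` is hypothetical, and the profiling tool's own tokenisation may differ.

```python
from collections import Counter

# Value counts for `race`, copied from the Common Values table in this report.
value_counts = {
    "White": 167365,
    "Black": 20415,
    "Asian or Pacific Islander": 5835,
    "Other": 3657,
    "Amer Indian Aleut or Eskimo": 2251,
}

def word_counts(counts):
    """Weight each lowercase, whitespace-separated token by its value's row count."""
    words = Counter()
    for value, n in counts.items():
        for token in value.lower().split():
            words[token] += n
    return words

words = word_counts(value_counts)
total = sum(words.values())
for token, n in words.most_common(3):
    print(f"{token:<10}{n:>8}{n / total:>8.1%}")
```

Under this assumption the token "or" gets 5835 + 2251 = 8086 occurrences, matching the table.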

Most occurring characters

ValueCountFrequency (%)
226032
16.6%
i189372
13.9%
e181359
13.3%
t173273
12.7%
h171022
12.5%
W167365
12.3%
a40171
 
2.9%
c32085
 
2.4%
l28501
 
2.1%
k22666
 
1.7%
Other values (14)131514
9.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter919382
67.4%
Space Separator226032
 
16.6%
Uppercase Letter217946
 
16.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i189372
20.6%
e181359
19.7%
t173273
18.8%
h171022
18.6%
a40171
 
4.4%
c32085
 
3.5%
l28501
 
3.1%
k22666
 
2.5%
r19829
 
2.2%
n16172
 
1.8%
Other values (6)44932
 
4.9%
Uppercase Letter
ValueCountFrequency (%)
W167365
76.8%
B20415
 
9.4%
A10337
 
4.7%
I8086
 
3.7%
P5835
 
2.7%
O3657
 
1.7%
E2251
 
1.0%
Space Separator
ValueCountFrequency (%)
226032
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1137328
83.4%
Common226032
 
16.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
i189372
16.7%
e181359
15.9%
t173273
15.2%
h171022
15.0%
W167365
14.7%
a40171
 
3.5%
c32085
 
2.8%
l28501
 
2.5%
k22666
 
2.0%
B20415
 
1.8%
Other values (13)111099
9.8%
Common
ValueCountFrequency (%)
226032
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1363360
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
226032
16.6%
i189372
13.9%
e181359
13.3%
t173273
12.7%
h171022
12.5%
W167365
12.3%
a40171
 
2.9%
c32085
 
2.4%
l28501
 
2.1%
k22666
 
1.7%
Other values (14)131514
9.6%

hispanic_Origin
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct10
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.5 MiB
All other
171907 
Mexican-American
 
8079
Mexican (Mexicano)
 
7234
Central or South American
 
3895
Puerto Rican
 
3313
Other values (5)
 
5095

Length

Max length26
Median length10
Mean length10.9685099
Min length3

Characters and Unicode

Total characters2188470
Distinct characters31
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row All other
2nd row All other
3rd row All other
4th row All other
5th row All other

Common Values

Value                      Count   Frequency (%)
All other                  171907  86.2%
Mexican-American           8079    4.0%
Mexican (Mexicano)         7234    3.6%
Central or South American  3895    2.0%
Puerto Rican               3313    1.7%
Other Spanish              2485    1.2%
Cuban                      1126    0.6%
NA                         874     0.4%
Do not know                306     0.2%
Chicano                    304     0.2%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
other174392
44.0%
all171907
43.3%
mexican-american8079
 
2.0%
mexicano7234
 
1.8%
mexican7234
 
1.8%
central3895
 
1.0%
or3895
 
1.0%
south3895
 
1.0%
american3895
 
1.0%
puerto3313
 
0.8%
Other values (8)9020
 
2.3%

Most occurring characters

ValueCountFrequency (%)
396759
18.1%
l347709
15.9%
e216121
9.9%
r197469
9.0%
o191466
8.7%
t185801
8.5%
A184755
8.4%
h181076
8.3%
n46256
 
2.1%
a45644
 
2.1%
Other values (21)195414
8.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1539866
70.4%
Space Separator396759
 
18.1%
Uppercase Letter229298
 
10.5%
Dash Punctuation8079
 
0.4%
Open Punctuation7234
 
0.3%
Close Punctuation7234
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l347709
22.6%
e216121
14.0%
r197469
12.8%
o191466
12.4%
t185801
12.1%
h181076
11.8%
n46256
 
3.0%
a45644
 
3.0%
i40623
 
2.6%
c38138
 
2.5%
Other values (8)49563
 
3.2%
Uppercase Letter
ValueCountFrequency (%)
A184755
80.6%
M22547
 
9.8%
S6380
 
2.8%
C5325
 
2.3%
P3313
 
1.4%
R3313
 
1.4%
O2485
 
1.1%
N874
 
0.4%
D306
 
0.1%
Space Separator
ValueCountFrequency (%)
396759
100.0%
Open Punctuation
ValueCountFrequency (%)
(7234
100.0%
Close Punctuation
ValueCountFrequency (%)
)7234
100.0%
Dash Punctuation
ValueCountFrequency (%)
-8079
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1769164
80.8%
Common419306
 
19.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
l347709
19.7%
e216121
12.2%
r197469
11.2%
o191466
10.8%
t185801
10.5%
A184755
10.4%
h181076
10.2%
n46256
 
2.6%
a45644
 
2.6%
i40623
 
2.3%
Other values (17)132244
 
7.5%
Common
ValueCountFrequency (%)
396759
94.6%
-8079
 
1.9%
(7234
 
1.7%
)7234
 
1.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII2188470
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
396759
18.1%
l347709
15.9%
e216121
9.9%
r197469
9.0%
o191466
8.7%
t185801
8.5%
A184755
8.4%
h181076
8.3%
n46256
 
2.1%
a45644
 
2.1%
Other values (21)195414
8.9%

sex
Categorical

HIGH CORRELATION

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.5 MiB
Female
103984 
Male
95539 

Length

Max length7
Median length7
Mean length6.042325947
Min length5

Characters and Unicode

Total characters1205583
Distinct characters7
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
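
The length and character figures above are consistent with every `sex` value carrying one leading space: "Female" has 6 letters but the maximum length is 7, and the space count equals the number of rows. The sketch below checks that reading against the reported totals and shows one way to normalise such values. The helper name `clean` is hypothetical, and the leading-space explanation is an inference from the statistics, not something the report states.

```python
# Figures copied from the `sex` section of this report.
n_female, n_male = 103984, 95539
reported_total_chars = 1205583
reported_mean_length = 6.042325947

# Assumption: each stored value is padded with one leading space.
padded_total = n_female * len(" Female") + n_male * len(" Male")
mean_length = padded_total / (n_female + n_male)

print(padded_total == reported_total_chars)
print(abs(mean_length - reported_mean_length) < 1e-8)

def clean(value: str) -> str:
    """Strip surrounding whitespace so ' Female' and 'Female' compare equal."""
    return value.strip()

print(clean(" Female"))
```

If the assumption holds, stripping whitespace before grouping or joining on this column avoids mismatches against unpadded labels.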

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row Female
2nd row Male
3rd row Female
4th row Female
5th row Female

Common Values

Value   Count   Frequency (%)
Female  103984  52.1%
Male    95539   47.9%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
female103984
52.1%
male95539
47.9%

Most occurring characters

ValueCountFrequency (%)
e303507
25.2%
199523
16.5%
a199523
16.5%
l199523
16.5%
F103984
 
8.6%
m103984
 
8.6%
M95539
 
7.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter806537
66.9%
Space Separator199523
 
16.5%
Uppercase Letter199523
 
16.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e303507
37.6%
a199523
24.7%
l199523
24.7%
m103984
 
12.9%
Uppercase Letter
ValueCountFrequency (%)
F103984
52.1%
M95539
47.9%
Space Separator
ValueCountFrequency (%)
199523
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1006060
83.5%
Common199523
 
16.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
e303507
30.2%
a199523
19.8%
l199523
19.8%
F103984
 
10.3%
m103984
 
10.3%
M95539
 
9.5%
Common
ValueCountFrequency (%)
199523
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1205583
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e303507
25.2%
199523
16.5%
a199523
16.5%
l199523
16.5%
F103984
 
8.6%
m103984
 
8.6%
M95539
 
7.9%

member_of_a_labor_union
Categorical

HIGH CORRELATION

Distinct3
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.5 MiB
Not in universe
180459 
No
 
16034
Yes
 
3030

Length

Max length16
Median length16
Mean length14.77306376
Min length3

Characters and Unicode

Total characters2947566
Distinct characters12
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row Not in universe
2nd row Not in universe
3rd row Not in universe
4th row Not in universe
5th row Not in universe

Common Values

Value            Count   Frequency (%)
Not in universe  180459  90.4%
No               16034   8.0%
Yes              3030    1.5%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
not180459
32.2%
in180459
32.2%
universe180459
32.2%
no16034
 
2.9%
yes3030
 
0.5%

Most occurring characters

ValueCountFrequency (%)
560441
19.0%
e363948
12.3%
i360918
12.2%
n360918
12.2%
N196493
 
6.7%
o196493
 
6.7%
s183489
 
6.2%
t180459
 
6.1%
u180459
 
6.1%
v180459
 
6.1%
Other values (2)183489
 
6.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2187602
74.2%
Space Separator560441
 
19.0%
Uppercase Letter199523
 
6.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e363948
16.6%
i360918
16.5%
n360918
16.5%
o196493
9.0%
s183489
8.4%
t180459
8.2%
u180459
8.2%
v180459
8.2%
r180459
8.2%
Uppercase Letter
ValueCountFrequency (%)
N196493
98.5%
Y3030
 
1.5%
Space Separator
ValueCountFrequency (%)
560441
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2387125
81.0%
Common560441
 
19.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e363948
15.2%
i360918
15.1%
n360918
15.1%
N196493
8.2%
o196493
8.2%
s183489
7.7%
t180459
7.6%
u180459
7.6%
v180459
7.6%
r180459
7.6%
Common
ValueCountFrequency (%)
560441
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII2947566
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
560441
19.0%
e363948
12.3%
i360918
12.2%
n360918
12.2%
N196493
 
6.7%
o196493
 
6.7%
s183489
 
6.2%
t180459
 
6.1%
u180459
 
6.1%
v180459
 
6.1%
Other values (2)183489
 
6.2%

reason_for_unemployment
Categorical

HIGH CORRELATION

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.5 MiB
Not in universe
193453 
Other job loser
 
2038
Re-entrant
 
2019
Job loser - on layoff
 
976
Job leaver
 
598

Length

Max length22
Median length16
Mean length15.9549676
Min length11

Characters and Unicode

Total characters3183383
Distinct characters23
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row Not in universe
2nd row Not in universe
3rd row Not in universe
4th row Not in universe
5th row Not in universe

Common Values

Value                  Count   Frequency (%)
Not in universe        193453  97.0%
Other job loser        2038    1.0%
Re-entrant             2019    1.0%
Job loser - on layoff  976     0.5%
Job leaver             598     0.3%
New entrant            439     0.2%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
not193453
32.5%
in193453
32.5%
universe193453
32.5%
job3612
 
0.6%
loser3014
 
0.5%
other2038
 
0.3%
re-entrant2019
 
0.3%
976
 
0.2%
on976
 
0.2%
layoff976
 
0.2%
Other values (3)1476
 
0.2%

Most occurring characters

ValueCountFrequency (%)
595446
18.7%
e398070
12.5%
n392798
12.3%
i386906
12.2%
o202031
 
6.3%
r201561
 
6.3%
t200407
 
6.3%
s196467
 
6.2%
v194051
 
6.1%
N193892
 
6.1%
Other values (13)221754
 
7.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2385419
74.9%
Space Separator595446
 
18.7%
Uppercase Letter199523
 
6.3%
Dash Punctuation2995
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e398070
16.7%
n392798
16.5%
i386906
16.2%
o202031
8.5%
r201561
8.4%
t200407
8.4%
s196467
8.2%
v194051
8.1%
u193453
8.1%
l4588
 
0.2%
Other values (7)15087
 
0.6%
Uppercase Letter
ValueCountFrequency (%)
N193892
97.2%
O2038
 
1.0%
R2019
 
1.0%
J1574
 
0.8%
Space Separator
ValueCountFrequency (%)
595446
100.0%
Dash Punctuation
ValueCountFrequency (%)
-2995
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2584942
81.2%
Common598441
 
18.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e398070
15.4%
n392798
15.2%
i386906
15.0%
o202031
7.8%
r201561
7.8%
t200407
7.8%
s196467
7.6%
v194051
7.5%
N193892
7.5%
u193453
7.5%
Other values (11)25306
 
1.0%
Common
ValueCountFrequency (%)
595446
99.5%
-2995
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII3183383
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
595446
18.7%
e398070
12.5%
n392798
12.3%
i386906
12.2%
o202031
 
6.3%
r201561
 
6.3%
t200407
 
6.3%
s196467
 
6.2%
v194051
 
6.1%
N193892
 
6.1%
Other values (13)221754
 
7.0%

full_or_part_time_employment_stat
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct8
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.5 MiB
Children or Armed Forces
123769 
Full-time schedules
40736 
Not in labor force
26808 
PT for non-econ reasons usually FT
 
3322
Unemployed full-time
 
2311
Other values (3)
 
2577

Length

Max length35
Median length25
Mean length23.33263834
Min length19

Characters and Unicode

Total characters4655398
Distinct characters27
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row Not in labor force
2nd row Children or Armed Forces
3rd row Not in labor force
4th row Children or Armed Forces
5th row Children or Armed Forces

Common Values

Value                               Count   Frequency (%)
Children or Armed Forces            123769  62.0%
Full-time schedules                 40736   20.4%
Not in labor force                  26808   13.4%
PT for non-econ reasons usually FT  3322    1.7%
Unemployed full-time                2311    1.2%
PT for econ reasons usually PT      1209    0.6%
Unemployed part- time               843     0.4%
PT for econ reasons usually FT      525     0.3%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
children123769
17.2%
or123769
17.2%
armed123769
17.2%
forces123769
17.2%
full-time43047
 
6.0%
schedules40736
 
5.6%
not26808
 
3.7%
labor26808
 
3.7%
force26808
 
3.7%
in26808
 
3.7%
Other values (10)35176
 
4.9%

Most occurring characters

ValueCountFrequency (%)
721267
15.5%
r559647
12.0%
e539897
11.6%
o349606
 
7.5%
d291428
 
6.3%
l290673
 
6.2%
s220409
 
4.7%
c196369
 
4.2%
i194467
 
4.2%
m170813
 
3.7%
Other values (17)1120822
24.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter3424690
73.6%
Space Separator721267
 
15.5%
Uppercase Letter462229
 
9.9%
Dash Punctuation47212
 
1.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r559647
16.3%
e539897
15.8%
o349606
10.2%
d291428
8.5%
l290673
8.5%
s220409
 
6.4%
c196369
 
5.7%
i194467
 
5.7%
m170813
 
5.0%
n170487
 
5.0%
Other values (8)440894
12.9%
Uppercase Letter
ValueCountFrequency (%)
F168352
36.4%
C123769
26.8%
A123769
26.8%
N26808
 
5.8%
T10112
 
2.2%
P6265
 
1.4%
U3154
 
0.7%
Space Separator
ValueCountFrequency (%)
721267
100.0%
Dash Punctuation
ValueCountFrequency (%)
-47212
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin3886919
83.5%
Common768479
 
16.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
r559647
14.4%
e539897
13.9%
o349606
 
9.0%
d291428
 
7.5%
l290673
 
7.5%
s220409
 
5.7%
c196369
 
5.1%
i194467
 
5.0%
m170813
 
4.4%
n170487
 
4.4%
Other values (15)903123
23.2%
Common
ValueCountFrequency (%)
721267
93.9%
-47212
 
6.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII4655398
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
721267
15.5%
r559647
12.0%
e539897
11.6%
o349606
 
7.5%
d291428
 
6.3%
l290673
 
6.2%
s220409
 
4.7%
c196369
 
4.2%
i194467
 
4.2%
m170813
 
3.7%
Other values (17)1120822
24.1%

capital_gains
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct      132
Distinct (%)  0.1%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          434.7189898
Minimum       0
Maximum       99999
Zeros         192144
Zeros (%)     96.3%
Negative      0
Negative (%)  0.0%
Memory size   1.5 MiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         0
Median                     0
Q3                         0
95-th percentile           0
Maximum                    99999
Range                      99999
Interquartile range (IQR)  0

Descriptive statistics

Standard deviation               4697.53128
Coefficient of variation (CV)    10.8059031
Kurtosis                         393.0628325
Mean                             434.7189898
Median Absolute Deviation (MAD)  0
Skewness                         18.99082234
Sum                              86736437
Variance                         22066800.12
Monotonicity                     Not monotonic
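
Several of the descriptive statistics above are tied together by standard definitions: the mean is the sum divided by the number of observations, the coefficient of variation is the standard deviation divided by the mean, and the variance is the square of the standard deviation. The sketch below checks the reported `capital_gains` figures against those identities. The variable names are illustrative, and the row count of 199523 is taken from the dataset overview.

```python
import math

# Figures copied from this report.
n_rows = 199523                  # number of observations (dataset overview)
reported_sum = 86736437
reported_mean = 434.7189898
reported_std = 4697.53128
reported_cv = 10.8059031
reported_variance = 22066800.12

derived_mean = reported_sum / n_rows          # mean = sum / n
derived_cv = reported_std / reported_mean     # CV = std / mean
derived_variance = reported_std ** 2          # variance = std squared

print(math.isclose(derived_mean, reported_mean, rel_tol=1e-8))
print(math.isclose(derived_cv, reported_cv, rel_tol=1e-6))
print(math.isclose(derived_variance, reported_variance, rel_tol=1e-6))
```

A CV above 10 together with a median of 0 reflects how strongly the few non-zero values dominate this variable.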
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
0                   192144  96.3%
15024               788     0.4%
7688                609     0.3%
7298                582     0.3%
99999               390     0.2%
3103                237     0.1%
5178                207     0.1%
5013                158     0.1%
4386                151     0.1%
3325                121     0.1%
Other values (122)  4136    2.1%
Smallest values
Value  Count   Frequency (%)
0      192144  96.3%
114    11      < 0.1%
401    33      < 0.1%
594    88      < 0.1%
914    17      < 0.1%
991    59      < 0.1%
1055   69      < 0.1%
1086   81      < 0.1%
1090   2       < 0.1%
1111   4       < 0.1%
Largest values
Value  Count  Frequency (%)
99999  390    0.2%
41310  2      < 0.1%
34095  11     < 0.1%
27828  94     < 0.1%
25236  23     < 0.1%
25124  18     < 0.1%
22040  2      < 0.1%
20051  91     < 0.1%
18481  14     < 0.1%
15831  16     < 0.1%

capital_losses
Real number (ℝ≥0)

HIGH CORRELATION
ZEROS

Distinct      113
Distinct (%)  0.1%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          37.31378839
Minimum       0
Maximum       4608
Zeros         195617
Zeros (%)     98.0%
Negative      0
Negative (%)  0.0%
Memory size   1.5 MiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         0
Median                     0
Q3                         0
95-th percentile           0
Maximum                    4608
Range                      4608
Interquartile range (IQR)  0

Descriptive statistics

Standard deviation               271.8964284
Coefficient of variation (CV)    7.286754847
Kurtosis                         61.63293305
Mean                             37.31378839
Median Absolute Deviation (MAD)  0
Skewness                         7.6325647
Sum                              7444959
Variance                         73927.66776
Monotonicity                     Not monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
0                   195617  98.0%
1902                407     0.2%
1977                381     0.2%
1887                364     0.2%
1602                193     0.1%
2415                122     0.1%
1485                95      < 0.1%
1848                88      < 0.1%
1876                87      < 0.1%
1672                85      < 0.1%
Other values (103)  2084    1.0%
Smallest values
Value  Count   Frequency (%)
0      195617  98.0%
155    1       < 0.1%
213    10      < 0.1%
323    10      < 0.1%
419    29      < 0.1%
625    25      < 0.1%
653    7       < 0.1%
772    5       < 0.1%
810    5       < 0.1%
880    9       < 0.1%
Largest values
Value  Count  Frequency (%)
4608   4      < 0.1%
4356   30     < 0.1%
3900   2      < 0.1%
3770   5      < 0.1%
3683   4      < 0.1%
3500   10     < 0.1%
3175   8      < 0.1%
3004   11     < 0.1%
2824   27     < 0.1%
2788   7      < 0.1%

divdends_from_stocks
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED
ZEROS

Distinct      1478
Distinct (%)  0.7%
Missing       0
Missing (%)   0.0%
Infinite      0
Infinite (%)  0.0%
Mean          197.5295329
Minimum       0
Maximum       99999
Zeros         178382
Zeros (%)     89.4%
Negative      0
Negative (%)  0.0%
Memory size   1.5 MiB

Quantile statistics

Minimum                    0
5-th percentile            0
Q1                         0
Median                     0
Q3                         0
95-th percentile           400
Maximum                    99999
Range                      99999
Interquartile range (IQR)  0

Descriptive statistics

Standard deviation               1984.163658
Coefficient of variation (CV)    10.04489622
Kurtosis                         1090.563754
Mean                             197.5295329
Median Absolute Deviation (MAD)  0
Skewness                         27.78650179
Sum                              39411685
Variance                         3936905.423
Monotonicity                     Not monotonic
Histogram with fixed size bins (bins=50)
Value                Count   Frequency (%)
0                    178382  89.4%
100                  1148    0.6%
500                  1030    0.5%
1000                 894     0.4%
200                  866     0.4%
50                   832     0.4%
2000                 574     0.3%
250                  555     0.3%
150                  549     0.3%
300                  523     0.3%
Other values (1468)  14170   7.1%
Smallest values
Value  Count   Frequency (%)
0      178382  89.4%
1      472     0.2%
2      193     0.1%
3      129     0.1%
4      75      < 0.1%
5      179     0.1%
6      100     0.1%
7      93      < 0.1%
8      94      < 0.1%
9      56      < 0.1%
Largest values
Value  Count  Frequency (%)
99999  25     < 0.1%
95095  1      < 0.1%
75000  5      < 0.1%
70000  3      < 0.1%
66621  2      < 0.1%
60000  7      < 0.1%
57678  1      < 0.1%
55000  1      < 0.1%
54600  2      < 0.1%
54500  2      < 0.1%

tax_filer_status
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.5 MiB
Nonfiler
75094 
Joint both under 65
67383 
Single
37421 
Joint both 65+
8332 
Head of household
 
7426

Length

Max length29
Median length9
Mean length13.31297144
Min length7

Characters and Unicode

Total characters2656244
Distinct characters24
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row Nonfiler
2nd row Head of household
3rd row Nonfiler
4th row Nonfiler
5th row Nonfiler

Common Values

Value                         Count  Frequency (%)
Nonfiler                      75094  37.6%
Joint both under 65           67383  33.8%
Single                        37421  18.8%
Joint both 65+                8332   4.2%
Head of household             7426   3.7%
Joint one under 65 & one 65+  3867   1.9%

Length

Histogram of lengths of the category

Pie chart

Value      Count  Frequency (%)
65         83449  18.3%
joint      79582  17.4%
both       75715  16.6%
nonfiler   75094  16.5%
under      71250  15.6%
single     37421  8.2%
one        7734   1.7%
head       7426   1.6%
of         7426   1.6%
household  7426   1.6%

Most occurring characters

ValueCountFrequency (%)
456390
17.2%
n271081
10.2%
o260403
 
9.8%
e206351
 
7.8%
i192097
 
7.2%
t155297
 
5.8%
r146344
 
5.5%
l119941
 
4.5%
h90567
 
3.4%
d86102
 
3.2%
Other values (14)671671
25.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1817367
68.4%
Space Separator456390
 
17.2%
Uppercase Letter199523
 
7.5%
Decimal Number166898
 
6.3%
Math Symbol12199
 
0.5%
Other Punctuation3867
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n271081
14.9%
o260403
14.3%
e206351
11.4%
i192097
10.6%
t155297
8.5%
r146344
8.1%
l119941
6.6%
h90567
 
5.0%
d86102
 
4.7%
f82520
 
4.5%
Other values (5)206664
11.4%
Uppercase Letter
ValueCountFrequency (%)
J79582
39.9%
N75094
37.6%
S37421
18.8%
H7426
 
3.7%
Decimal Number
Value  Count  Frequency (%)
6      83449  50.0%
5      83449  50.0%
Space Separator
ValueCountFrequency (%)
456390
100.0%
Math Symbol
ValueCountFrequency (%)
+12199
100.0%
Other Punctuation
ValueCountFrequency (%)
&3867
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2016890
75.9%
Common639354
 
24.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
n271081
13.4%
o260403
12.9%
e206351
10.2%
i192097
9.5%
t155297
 
7.7%
r146344
 
7.3%
l119941
 
5.9%
h90567
 
4.5%
d86102
 
4.3%
f82520
 
4.1%
Other values (9)406187
20.1%
Common
Value    Count   Frequency (%)
(space)  456390  71.4%
6        83449   13.1%
5        83449   13.1%
+        12199   1.9%
&        3867    0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII2656244
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
456390
17.2%
n271081
10.2%
o260403
 
9.8%
e206351
 
7.8%
i192097
 
7.2%
t155297
 
5.8%
r146344
 
5.5%
l119941
 
4.5%
h90567
 
3.4%
d86102
 
3.2%
Other values (14)671671
25.3%

region_of_previous_residence
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct6
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.5 MiB
Not in universe
183750 
South
 
4889
West
 
4074
Midwest
 
3575
Northeast
 
2705

Length

Max length16
Median length16
Mean length15.28176701
Min length5

Characters and Unicode

Total characters3049064
Distinct characters20
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row Not in universe
2nd row South
3rd row Not in universe
4th row Not in universe
5th row Not in universe

Common Values

Value            Count   Frequency (%)
Not in universe  183750  92.1%
South            4889    2.5%
West             4074    2.0%
Midwest          3575    1.8%
Northeast        2705    1.4%
Abroad           530     0.3%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
not183750
32.4%
in183750
32.4%
universe183750
32.4%
south4889
 
0.9%
west4074
 
0.7%
midwest3575
 
0.6%
northeast2705
 
0.5%
abroad530
 
0.1%

Most occurring characters

ValueCountFrequency (%)
567023
18.6%
e377854
12.4%
i371075
12.2%
n367500
12.1%
t201698
 
6.6%
s194104
 
6.4%
o191874
 
6.3%
u188639
 
6.2%
r186985
 
6.1%
N186455
 
6.1%
Other values (10)215857
 
7.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2282518
74.9%
Space Separator567023
 
18.6%
Uppercase Letter199523
 
6.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e377854
16.6%
i371075
16.3%
n367500
16.1%
t201698
8.8%
s194104
8.5%
o191874
8.4%
u188639
8.3%
r186985
8.2%
v183750
8.1%
h7594
 
0.3%
Other values (4)11445
 
0.5%
Uppercase Letter
ValueCountFrequency (%)
N186455
93.5%
S4889
 
2.5%
W4074
 
2.0%
M3575
 
1.8%
A530
 
0.3%
Space Separator
ValueCountFrequency (%)
567023
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2482041
81.4%
Common567023
 
18.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e377854
15.2%
i371075
15.0%
n367500
14.8%
t201698
8.1%
s194104
7.8%
o191874
7.7%
u188639
7.6%
r186985
7.5%
N186455
7.5%
v183750
7.4%
Other values (9)32107
 
1.3%
Common
ValueCountFrequency (%)
567023
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII3049064
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
567023
18.6%
e377854
12.4%
i371075
12.2%
n367500
12.1%
t201698
 
6.6%
s194104
 
6.4%
o191874
 
6.3%
u188639
 
6.2%
r186985
 
6.1%
N186455
 
6.1%
Other values (10)215857
 
7.1%

state_of_previous_residence
Categorical

HIGH CARDINALITY
HIGH CORRELATION
HIGH CORRELATION

Distinct51
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.5 MiB
Not in universe
183750 
California
 
1714
Utah
 
1063
Florida
 
849
North Carolina
 
812
Other values (46)
 
11335

Length

Max length21
Median length16
Mean length15.45687465
Min length2

Characters and Unicode

Total characters3084002
Distinct characters46
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row Not in universe
2nd row Arkansas
3rd row Not in universe
4th row Not in universe
5th row Not in universe

Common Values

Value              Count   Frequency (%)
Not in universe    183750  92.1%
California         1714    0.9%
Utah               1063    0.5%
Florida            849     0.4%
North Carolina     812     0.4%
?                  708     0.4%
Abroad             671     0.3%
Oklahoma           626     0.3%
Minnesota          576     0.3%
Indiana            533     0.3%
Other values (41)  8221    4.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
not183750
32.2%
in183750
32.2%
universe183750
32.2%
california1714
 
0.3%
north1311
 
0.2%
utah1063
 
0.2%
new975
 
0.2%
carolina907
 
0.2%
florida849
 
0.1%
708
 
0.1%
Other values (46)11228
 
2.0%

Most occurring characters

ValueCountFrequency (%)
570005
18.5%
i380324
12.3%
n377218
12.2%
e373184
12.1%
o195445
 
6.3%
r192090
 
6.2%
s189330
 
6.1%
t189230
 
6.1%
N186388
 
6.0%
u184978
 
6.0%
Other values (36)245810
8.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2311608
75.0%
Space Separator570005
 
18.5%
Uppercase Letter201681
 
6.5%
Other Punctuation708
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i380324
16.5%
n377218
16.3%
e373184
16.1%
o195445
8.5%
r192090
8.3%
s189330
8.2%
t189230
8.2%
u184978
8.0%
v184123
8.0%
a19048
 
0.8%
Other values (14)26638
 
1.2%
Uppercase Letter
ValueCountFrequency (%)
N186388
92.4%
C3093
 
1.5%
M2539
 
1.3%
A1625
 
0.8%
O1073
 
0.5%
U1063
 
0.5%
I933
 
0.5%
F849
 
0.4%
D826
 
0.4%
W577
 
0.3%
Other values (10)2715
 
1.3%
Space Separator
ValueCountFrequency (%)
570005
100.0%
Other Punctuation
ValueCountFrequency (%)
?708
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2513289
81.5%
Common570713
 
18.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
i380324
15.1%
n377218
15.0%
e373184
14.8%
o195445
7.8%
r192090
7.6%
s189330
7.5%
t189230
7.5%
N186388
7.4%
u184978
7.4%
v184123
7.3%
Other values (34)60979
 
2.4%
Common
ValueCountFrequency (%)
570005
99.9%
?708
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII3084002
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
570005
18.5%
i380324
12.3%
n377218
12.2%
e373184
12.1%
o195445
 
6.3%
r192090
 
6.2%
s189330
 
6.1%
t189230
 
6.1%
N186388
 
6.0%
u184978
 
6.0%
Other values (36)245810
8.0%

detailed_household_and_family_stat
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct38
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.5 MiB
Householder
53248 
Child <18 never marr not in subfamily
50326 
Spouse of householder
41695 
Nonfamily householder
22213 
Child 18+ never marr Not in a subfamily
12030 
Other values (33)
20011 

Length

Max length48
Median length22
Mean length25.71388762
Min length12

Characters and Unicode

Total characters5130512
Distinct characters35
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st row Other Rel 18+ ever marr not in subfamily
2nd row Householder
3rd row Child 18+ never marr Not in a subfamily
4th row Child <18 never marr not in subfamily
5th row Child <18 never marr not in subfamily

Common Values

Value                                            Count  Frequency (%)
Householder                                      53248  26.7%
Child <18 never marr not in subfamily            50326  25.2%
Spouse of householder                            41695  20.9%
Nonfamily householder                            22213  11.1%
Child 18+ never marr Not in a subfamily          12030  6.0%
Secondary individual                             6122   3.1%
Other Rel 18+ ever marr not in subfamily         1956   1.0%
Grandchild <18 never marr child of subfamily RP  1868   0.9%
Other Rel 18+ never marr not in subfamily        1728   0.9%
Grandchild <18 never marr not in subfamily       1066   0.5%
Other values (28)                                7271   3.6%

Length

Histogram of lengths of the category
Value              Count   Frequency (%)
householder        117156  14.9%
subfamily          76049   9.7%
18                 75312   9.6%
marr               73797   9.4%
never              69408   8.8%
in                 69347   8.8%
not                69151   8.8%
child              68138   8.7%
of                 49377   6.3%
spouse             42526   5.4%
Other values (15)  77412   9.8%

Most occurring characters

ValueCountFrequency (%)
787673
15.4%
e446352
 
8.7%
o423897
 
8.3%
r357168
 
7.0%
l300845
 
5.9%
h258900
 
5.0%
i257293
 
5.0%
u244446
 
4.8%
s236706
 
4.6%
n234893
 
4.6%
Other values (25)1582339
30.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter3885632
75.7%
Space Separator787673
 
15.4%
Uppercase Letter232003
 
4.5%
Decimal Number150624
 
2.9%
Math Symbol74580
 
1.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e446352
11.5%
o423897
10.9%
r357168
 
9.2%
l300845
 
7.7%
h258900
 
6.7%
i257293
 
6.6%
u244446
 
6.3%
s236706
 
6.1%
n234893
 
6.0%
d211877
 
5.5%
Other values (11)913255
23.5%
Uppercase Letter
ValueCountFrequency (%)
C65614
28.3%
H53248
23.0%
S47869
20.6%
N35256
15.2%
R13224
 
5.7%
P6898
 
3.0%
O6326
 
2.7%
G3372
 
1.5%
I196
 
0.1%
Decimal Number
ValueCountFrequency (%)
175312
50.0%
875312
50.0%
Math Symbol
ValueCountFrequency (%)
<54645
73.3%
+19935
 
26.7%
Space Separator
ValueCountFrequency (%)
787673
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin4117635
80.3%
Common1012877
 
19.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e446352
 
10.8%
o423897
 
10.3%
r357168
 
8.7%
l300845
 
7.3%
h258900
 
6.3%
i257293
 
6.2%
u244446
 
5.9%
s236706
 
5.7%
n234893
 
5.7%
d211877
 
5.1%
Other values (20)1145258
27.8%
Common
ValueCountFrequency (%)
787673
77.8%
175312
 
7.4%
875312
 
7.4%
<54645
 
5.4%
+19935
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII5130512
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
787673
15.4%
e446352
 
8.7%
o423897
 
8.3%
r357168
 
7.0%
l300845
 
5.9%
h258900
 
5.0%
i257293
 
5.0%
u244446
 
4.8%
s236706
 
4.6%
n234893
 
4.6%
Other values (25)1582339
30.8%

detailed_household_summary_in_household
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct        8
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     1.5 MiB

Householder                      75475
Child under 18 never married     50426
Spouse of householder            41709
Child 18 or older                14430
Other relative of householder    9703
Other values (3)                 7780

Length

Max length       37
Median length    22
Mean length      20.28793172
Min length       12

Characters and Unicode

Total characters       4047909
Distinct characters    29
Distinct categories    5
Distinct scripts       2
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row Other relative of householder
2nd row Householder
3rd row Child 18 or older
4th row Child under 18 never married
5th row Child under 18 never married

Common Values

Value                                   Count    Frequency (%)
Householder                             75475    37.8%
Child under 18 never married            50426    25.3%
Spouse of householder                   41709    20.9%
Child 18 or older                       14430    7.2%
Other relative of householder           9703     4.9%
Nonrelative of householder              7601     3.8%
Group Quarters- Secondary individual    132      0.1%
Child under 18 ever married             47       < 0.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
householder134488
23.5%
child64903
11.3%
1864903
11.3%
of59013
10.3%
married50473
 
8.8%
under50473
 
8.8%
never50426
 
8.8%
spouse41709
 
7.3%
or14430
 
2.5%
older14430
 
2.5%
Other values (8)27582
 
4.8%

Most occurring characters

ValueCountFrequency (%)
572830
14.2%
e571582
14.1%
o406423
10.0%
r392775
9.7%
d315163
7.8%
h268107
 
6.6%
l231257
 
5.7%
u227066
 
5.6%
s176329
 
4.4%
i133076
 
3.3%
Other values (19)753301
18.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter3145354
77.7%
Space Separator572830
 
14.2%
Uppercase Letter199787
 
4.9%
Decimal Number129806
 
3.2%
Dash Punctuation132
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e571582
18.2%
o406423
12.9%
r392775
12.5%
d315163
10.0%
h268107
8.5%
l231257
7.4%
u227066
 
7.2%
s176329
 
5.6%
i133076
 
4.2%
n108764
 
3.5%
Other values (8)314812
10.0%
Uppercase Letter
ValueCountFrequency (%)
H75475
37.8%
C64903
32.5%
S41841
20.9%
O9703
 
4.9%
N7601
 
3.8%
G132
 
0.1%
Q132
 
0.1%
Decimal Number
ValueCountFrequency (%)
164903
50.0%
864903
50.0%
Space Separator
ValueCountFrequency (%)
572830
100.0%
Dash Punctuation
ValueCountFrequency (%)
-132
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin3345141
82.6%
Common702768
 
17.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e571582
17.1%
o406423
12.1%
r392775
11.7%
d315163
9.4%
h268107
8.0%
l231257
6.9%
u227066
 
6.8%
s176329
 
5.3%
i133076
 
4.0%
n108764
 
3.3%
Other values (15)514599
15.4%
Common
ValueCountFrequency (%)
572830
81.5%
164903
 
9.2%
864903
 
9.2%
-132
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII4047909
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
572830
14.2%
e571582
14.1%
o406423
10.0%
r392775
9.7%
d315163
7.8%
h268107
 
6.6%
l231257
 
5.7%
u227066
 
5.6%
s176329
 
4.4%
i133076
 
3.3%
Other values (19)753301
18.6%

instance_weight
Real number (ℝ≥0)

HIGH CORRELATION

Distinct        99800
Distinct (%)    50.0%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            1740.380269
Minimum         37.87
Maximum         18656.3
Zeros           0
Zeros (%)       0.0%
Negative        0
Negative (%)    0.0%
Memory size     1.5 MiB

Quantile statistics

Minimum                      37.87
5-th percentile              395.342
Q1                           1061.615
median                       1618.31
Q3                           2188.61
95-th percentile             3585.909
Maximum                      18656.3
Range                        18618.43
Interquartile range (IQR)    1126.995

Descriptive statistics

Standard deviation                 993.7681558
Coefficient of variation (CV)      0.5710063331
Kurtosis                           5.412514036
Mean                               1740.380269
Median Absolute Deviation (MAD)    561.46
Skewness                           1.432733152
Sum                                347245892.5
Variance                           987575.1475
Monotonicity                       Not monotonic
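
Several of the figures above are derived from others, so they can be cross-checked with plain arithmetic. The sketch below re-enters the reported values by hand; the variable names are illustrative, and the count of 199523 is the number of observations given in the report overview.

```python
# Figures copied from the instance_weight statistics in this report.
n = 199523            # number of observations (report overview)
minimum, maximum = 37.87, 18656.3
q1, q3 = 1061.615, 2188.61
mean = 1740.380269
std = 993.7681558

value_range = maximum - minimum   # reported Range: 18618.43
iqr = q3 - q1                     # reported IQR: 1126.995
cv = std / mean                   # reported CV: 0.5710063331
variance = std ** 2               # reported Variance: 987575.1475
total = mean * n                  # reported Sum: 347245892.5

print(round(value_range, 2), round(iqr, 3), round(cv, 6))
print(round(variance, 2), round(total, 1))
```

Small differences in the last digit are expected, because the inputs are themselves rounded in the report.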
Histogram with fixed size bins (bins=50)
Value                   Count     Frequency (%)
1601.4                  32        < 0.1%
753.23                  32        < 0.1%
1191.21                 32        < 0.1%
1787.34                 32        < 0.1%
1317.51                 31        < 0.1%
707.9                   31        < 0.1%
1070.15                 30        < 0.1%
1009.39                 28        < 0.1%
1002.02                 28        < 0.1%
1839.19                 28        < 0.1%
Other values (99790)    199219    99.8%

Value    Count    Frequency (%)
37.87    1        < 0.1%
39.11    1        < 0.1%
40.67    2        < 0.1%
42.82    2        < 0.1%
43.26    3        < 0.1%
45.74    2        < 0.1%
47.83    6        < 0.1%
49.82    2        < 0.1%
52.43    1        < 0.1%
52.46    4        < 0.1%

Value      Count    Frequency (%)
18656.3    1        < 0.1%
16349.2    1        < 0.1%
13911.5    1        < 0.1%
13145.1    1        < 0.1%
13114.2    1        < 0.1%
12960.2    1        < 0.1%
12399.9    1        < 0.1%
12184.5    1        < 0.1%
11958.4    1        < 0.1%
11863      1        < 0.1%

migration_code_change_in_msa
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct        10
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     1.5 MiB

?                   99696
Nonmover            82538
MSA to MSA          10601
NonMSA to nonMSA    2811
Not in universe     1516
Other values (5)    2361

Length

Max length       17
Median length    9
Mean length      5.841186229
Min length       2

Characters and Unicode

Total characters       1165451
Distinct characters    21
Distinct categories    4
Distinct scripts       2
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row ?
2nd row MSA to MSA
3rd row ?
4th row Nonmover
5th row Nonmover

Common Values

Value               Count    Frequency (%)
?                   99696    50.0%
Nonmover            82538    41.4%
MSA to MSA          10601    5.3%
NonMSA to nonMSA    2811     1.4%
Not in universe     1516     0.8%
MSA to nonMSA       790      0.4%
NonMSA to MSA       615      0.3%
Abroad to MSA       453      0.2%
Not identifiable    430      0.2%
Abroad to nonMSA    73       < 0.1%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
99696
42.7%
nonmover82538
35.3%
msa23060
 
9.9%
to15343
 
6.6%
nonmsa7100
 
3.0%
not1946
 
0.8%
in1516
 
0.6%
universe1516
 
0.6%
abroad526
 
0.2%
identifiable430
 
0.2%

Most occurring characters

ValueCountFrequency (%)
233671
20.0%
o189991
16.3%
?99696
8.6%
n96774
8.3%
N87910
 
7.5%
e86430
 
7.4%
r84580
 
7.3%
v84054
 
7.2%
m82538
 
7.1%
A30686
 
2.6%
Other values (11)89121
 
7.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter653168
56.0%
Space Separator233671
 
20.0%
Uppercase Letter178916
 
15.4%
Other Punctuation99696
 
8.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o189991
29.1%
n96774
14.8%
e86430
13.2%
r84580
12.9%
v84054
12.9%
m82538
12.6%
t17719
 
2.7%
i4322
 
0.7%
u1516
 
0.2%
s1516
 
0.2%
Other values (5)3728
 
0.6%
Uppercase Letter
ValueCountFrequency (%)
N87910
49.1%
A30686
 
17.2%
M30160
 
16.9%
S30160
 
16.9%
Space Separator
ValueCountFrequency (%)
233671
100.0%
Other Punctuation
ValueCountFrequency (%)
?99696
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin832084
71.4%
Common333367
28.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
o189991
22.8%
n96774
11.6%
N87910
10.6%
e86430
10.4%
r84580
10.2%
v84054
10.1%
m82538
9.9%
A30686
 
3.7%
M30160
 
3.6%
S30160
 
3.6%
Other values (9)28801
 
3.5%
Common
ValueCountFrequency (%)
233671
70.1%
?99696
29.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII1165451
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
233671
20.0%
o189991
16.3%
?99696
8.6%
n96774
8.3%
N87910
 
7.5%
e86430
 
7.4%
r84580
 
7.3%
v84054
 
7.2%
m82538
 
7.1%
A30686
 
2.6%
Other values (11)89121
 
7.6%

migration_code_change_in_reg
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct        9
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     1.5 MiB

?                              99696
Nonmover                       82538
Same county                    9812
Different county same state    2797
Not in universe                1516
Other values (4)               3164

Length

Max length       31
Median length    7
Mean length      6.166862968
Min length       2

Characters and Unicode

Total characters       1230431
Distinct characters    23
Distinct categories    4
Distinct scripts       2
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row ?
2nd row Same county
3rd row ?
4th row Nonmover
5th row Nonmover

Common Values

Value                             Count    Frequency (%)
?                                 99696    50.0%
Nonmover                          82538    41.4%
Same county                       9812     4.9%
Different county same state       2797     1.4%
Not in universe                   1516     0.8%
Different region                  1178     0.6%
Different state same division     991      0.5%
Abroad                            530      0.3%
Different division same region    465      0.2%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
99696
44.1%
nonmover82538
36.5%
same14065
 
6.2%
county12609
 
5.6%
different5431
 
2.4%
state3788
 
1.7%
region1643
 
0.7%
not1516
 
0.7%
in1516
 
0.7%
universe1516
 
0.7%
Other values (2)1986
 
0.9%

Most occurring characters

ValueCountFrequency (%)
226304
18.4%
o182830
14.9%
e115928
9.4%
n106709
8.7%
?99696
8.1%
m96603
7.9%
r91658
7.4%
v85510
 
6.9%
N84054
 
6.8%
t27132
 
2.2%
Other values (13)114007
9.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter804604
65.4%
Space Separator226304
 
18.4%
Uppercase Letter99827
 
8.1%
Other Punctuation99696
 
8.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o182830
22.7%
e115928
14.4%
n106709
13.3%
m96603
12.0%
r91658
11.4%
v85510
10.6%
t27132
 
3.4%
a18383
 
2.3%
i14474
 
1.8%
u14125
 
1.8%
Other values (7)51252
 
6.4%
Uppercase Letter
ValueCountFrequency (%)
N84054
84.2%
S9812
 
9.8%
D5431
 
5.4%
A530
 
0.5%
Space Separator
ValueCountFrequency (%)
226304
100.0%
Other Punctuation
ValueCountFrequency (%)
?99696
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin904431
73.5%
Common326000
 
26.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o182830
20.2%
e115928
12.8%
n106709
11.8%
m96603
10.7%
r91658
10.1%
v85510
9.5%
N84054
9.3%
t27132
 
3.0%
a18383
 
2.0%
i14474
 
1.6%
Other values (11)81150
9.0%
Common
ValueCountFrequency (%)
226304
69.4%
?99696
30.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII1230431
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
226304
18.4%
o182830
14.9%
e115928
9.4%
n106709
8.7%
?99696
8.1%
m96603
7.9%
r91658
7.4%
v85510
 
6.9%
N84054
 
6.8%
t27132
 
2.2%
Other values (13)114007
9.3%

migration_code_move_within_reg
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct        10
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     1.5 MiB

?                              99696
Nonmover                       82538
Same county                    9812
Different county same state    2797
Not in universe                1516
Other values (5)               3164

Length

Max length       29
Median length    7
Mean length      6.186038702
Min length       2

Characters and Unicode

Total characters       1234257
Distinct characters    26
Distinct categories    4
Distinct scripts       2
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row ?
2nd row Same county
3rd row ?
4th row Nonmover
5th row Nonmover

Common Values

Value                           Count    Frequency (%)
?                               99696    50.0%
Nonmover                        82538    41.4%
Same county                     9812     4.9%
Different county same state     2797     1.4%
Not in universe                 1516     0.8%
Different state in South        973      0.5%
Different state in West         679      0.3%
Different state in Midwest      551      0.3%
Abroad                          530      0.3%
Different state in Northeast    431      0.2%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
99696
43.6%
nonmover82538
36.1%
same12609
 
5.5%
county12609
 
5.5%
different5431
 
2.4%
state5431
 
2.4%
in4150
 
1.8%
not1516
 
0.7%
universe1516
 
0.7%
south973
 
0.4%
Other values (4)2191
 
1.0%

Most occurring characters

ValueCountFrequency (%)
228660
18.5%
o181135
14.7%
e116133
9.4%
n106244
8.6%
?99696
8.1%
m95147
7.7%
r90446
 
7.3%
N84485
 
6.8%
v84054
 
6.8%
t33483
 
2.7%
Other values (16)114774
9.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter803440
65.1%
Space Separator228660
 
18.5%
Uppercase Letter102461
 
8.3%
Other Punctuation99696
 
8.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o181135
22.5%
e116133
14.5%
n106244
13.2%
m95147
11.8%
r90446
11.3%
v84054
10.5%
t33483
 
4.2%
a19001
 
2.4%
u15098
 
1.9%
c12609
 
1.6%
Other values (8)50090
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
N84485
82.5%
S10785
 
10.5%
D5431
 
5.3%
W679
 
0.7%
M551
 
0.5%
A530
 
0.5%
Space Separator
ValueCountFrequency (%)
228660
100.0%
Other Punctuation
ValueCountFrequency (%)
?99696
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin905901
73.4%
Common328356
 
26.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
o181135
20.0%
e116133
12.8%
n106244
11.7%
m95147
10.5%
r90446
10.0%
N84485
9.3%
v84054
9.3%
t33483
 
3.7%
a19001
 
2.1%
u15098
 
1.7%
Other values (14)80675
8.9%
Common
ValueCountFrequency (%)
228660
69.6%
?99696
30.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII1234257
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
228660
18.5%
o181135
14.7%
e116133
9.4%
n106244
8.6%
?99696
8.1%
m95147
7.7%
r90446
 
7.3%
N84485
 
6.8%
v84054
 
6.8%
t33483
 
2.7%
Other values (16)114774
9.3%

live_in_this_house_1_year_ago
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct        3
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     1.5 MiB

Not in universe under 1 year old    101212
Yes                                 82538
No                                  15773

Length

Max length       33
Median length    33
Mean length      18.63177178
Min length       3

Characters and Unicode

Total characters       3717467
Distinct characters    17
Distinct categories    4
Distinct scripts       2
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row Not in universe under 1 year old
2nd row No
3rd row Not in universe under 1 year old
4th row Yes
5th row Yes

Common Values

Value                               Count     Frequency (%)
Not in universe under 1 year old    101212    50.7%
Yes                                 82538     41.4%
No                                  15773     7.9%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
not101212
12.5%
in101212
12.5%
universe101212
12.5%
under101212
12.5%
1101212
12.5%
year101212
12.5%
old101212
12.5%
yes82538
10.2%
no15773
 
2.0%

Most occurring characters

ValueCountFrequency (%)
806795
21.7%
e487386
13.1%
n303636
 
8.2%
r303636
 
8.2%
o218197
 
5.9%
i202424
 
5.4%
u202424
 
5.4%
d202424
 
5.4%
s183750
 
4.9%
N116985
 
3.1%
Other values (7)689810
18.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2609937
70.2%
Space Separator806795
 
21.7%
Uppercase Letter199523
 
5.4%
Decimal Number101212
 
2.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e487386
18.7%
n303636
11.6%
r303636
11.6%
o218197
8.4%
i202424
7.8%
u202424
7.8%
d202424
7.8%
s183750
 
7.0%
t101212
 
3.9%
v101212
 
3.9%
Other values (3)303636
11.6%
Uppercase Letter
ValueCountFrequency (%)
N116985
58.6%
Y82538
41.4%
Space Separator
ValueCountFrequency (%)
806795
100.0%
Decimal Number
ValueCountFrequency (%)
1101212
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2809460
75.6%
Common908007
 
24.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e487386
17.3%
n303636
10.8%
r303636
10.8%
o218197
7.8%
i202424
7.2%
u202424
7.2%
d202424
7.2%
s183750
 
6.5%
N116985
 
4.2%
t101212
 
3.6%
Other values (5)487386
17.3%
Common
ValueCountFrequency (%)
806795
88.9%
1101212
 
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII3717467
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
806795
21.7%
e487386
13.1%
n303636
 
8.2%
r303636
 
8.2%
o218197
 
5.9%
i202424
 
5.4%
u202424
 
5.4%
d202424
 
5.4%
s183750
 
4.9%
N116985
 
3.1%
Other values (7)689810
18.6%

migration_prev_res_in_sunbelt
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct        4
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     1.5 MiB

?                  99696
Not in universe    84054
No                 9987
Yes                5786

Length

Max length       16
Median length    3
Mean length      8.005899069
Min length       2

Characters and Unicode

Total characters       1597361
Distinct characters    13
Distinct categories    4
Distinct scripts       2
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row ?
2nd row Yes
3rd row ?
4th row Not in universe
5th row Not in universe

Common Values

Value              Count    Frequency (%)
?                  99696    50.0%
Not in universe    84054    42.1%
No                 9987     5.0%
Yes                5786     2.9%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
99696
27.1%
not84054
22.9%
in84054
22.9%
universe84054
22.9%
no9987
 
2.7%
yes5786
 
1.6%

Most occurring characters

ValueCountFrequency (%)
367631
23.0%
e173894
10.9%
i168108
10.5%
n168108
10.5%
?99696
 
6.2%
N94041
 
5.9%
o94041
 
5.9%
s89840
 
5.6%
t84054
 
5.3%
u84054
 
5.3%
Other values (3)173894
10.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1030207
64.5%
Space Separator367631
 
23.0%
Uppercase Letter99827
 
6.2%
Other Punctuation99696
 
6.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e173894
16.9%
i168108
16.3%
n168108
16.3%
o94041
9.1%
s89840
8.7%
t84054
8.2%
u84054
8.2%
v84054
8.2%
r84054
8.2%
Uppercase Letter
ValueCountFrequency (%)
N94041
94.2%
Y5786
 
5.8%
Space Separator
ValueCountFrequency (%)
367631
100.0%
Other Punctuation
ValueCountFrequency (%)
?99696
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1130034
70.7%
Common467327
29.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e173894
15.4%
i168108
14.9%
n168108
14.9%
N94041
8.3%
o94041
8.3%
s89840
8.0%
t84054
7.4%
u84054
7.4%
v84054
7.4%
r84054
7.4%
Common
ValueCountFrequency (%)
367631
78.7%
?99696
 
21.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII1597361
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
367631
23.0%
e173894
10.9%
i168108
10.5%
n168108
10.5%
?99696
 
6.2%
N94041
 
5.9%
o94041
 
5.9%
s89840
 
5.6%
t84054
 
5.3%
u84054
 
5.3%
Other values (3)173894
10.9%

num_persons_worked_for_employer
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct        7
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Infinite        0
Infinite (%)    0.0%
Mean            1.95618049
Minimum         0
Maximum         6
Zeros           95983
Zeros (%)       48.1%
Negative        0
Negative (%)    0.0%
Memory size     1.5 MiB

Quantile statistics

Minimum                      0
5-th percentile              0
Q1                           0
median                       1
Q3                           4
95-th percentile             6
Maximum                      6
Range                        6
Interquartile range (IQR)    4

Descriptive statistics

Standard deviation                 2.365125505
Coefficient of variation (CV)      1.209052803
Kurtosis                           -1.082246833
Mean                               1.95618049
Median Absolute Deviation (MAD)    1
Skewness                           0.7515606804
Sum                                390303
Variance                           5.593818657
Monotonicity                       Not monotonic
Histogram with fixed size bins (bins=7)
Value    Count    Frequency (%)
0        95983    48.1%
6        36511    18.3%
1        23109    11.6%
4        14379    7.2%
3        13425    6.7%
2        10081    5.1%
5        6035     3.0%

Value    Count    Frequency (%)
0        95983    48.1%
1        23109    11.6%
2        10081    5.1%
3        13425    6.7%
4        14379    7.2%
5        6035     3.0%
6        36511    18.3%

Value    Count    Frequency (%)
6        36511    18.3%
5        6035     3.0%
4        14379    7.2%
3        13425    6.7%
2        10081    5.1%
1        23109    11.6%
0        95983    48.1%
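
Because this variable takes only seven distinct values, its summary statistics can be recomputed from the frequency table alone. The following is a minimal sketch, assuming the variance uses the sample (n - 1) denominator, a common default in data-frame libraries such as pandas; the dictionary `freq` is simply the table above retyped by hand.

```python
# Value -> count, retyped from the frequency table for
# num_persons_worked_for_employer.
freq = {0: 95983, 1: 23109, 2: 10081, 3: 13425, 4: 14379, 5: 6035, 6: 36511}

n = sum(freq.values())                                       # number of observations
total = sum(value * count for value, count in freq.items())  # reported Sum: 390303
mean = total / n                                             # reported Mean: 1.95618049

# Sample variance with an (n - 1) denominator (an assumption, see above).
sum_sq_dev = sum(count * (value - mean) ** 2 for value, count in freq.items())
variance = sum_sq_dev / (n - 1)                              # reported Variance: 5.593818657
zeros_pct = 100 * freq[0] / n                                # reported Zeros (%): 48.1%

print(n, total, round(mean, 6), round(variance, 6), round(zeros_pct, 1))
```

Agreement with the reported Sum, Mean and Variance suggests the table and the summary statistics describe the same column.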

family_members_under_18
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct        5
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     1.5 MiB

Not in universe           144232
Both parents present      38983
Mother only present       12772
Father only present       1883
Neither parent present    1653

Length

Max length       23
Median length    16
Mean length      17.32869895
Min length       16

Characters and Unicode

Total characters       3457474
Distinct characters    19
Distinct categories    3
Distinct scripts       2
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row Not in universe
2nd row Not in universe
3rd row Not in universe
4th row Both parents present
5th row Both parents present

Common Values

Value                     Count     Frequency (%)
Not in universe           144232    72.3%
Both parents present      38983     19.5%
Mother only present       12772     6.4%
Father only present       1883      0.9%
Neither parent present    1653      0.8%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
not144232
24.1%
in144232
24.1%
universe144232
24.1%
present55291
 
9.2%
both38983
 
6.5%
parents38983
 
6.5%
only14655
 
2.4%
mother12772
 
2.1%
father1883
 
0.3%
neither1653
 
0.3%

Most occurring characters

ValueCountFrequency (%)
598569
17.3%
e457643
13.2%
n399046
11.5%
t295450
8.5%
i290117
8.4%
r256467
7.4%
s238506
 
6.9%
o210642
 
6.1%
N145885
 
4.2%
u144232
 
4.2%
Other values (9)420917
12.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter2659382
76.9%
Space Separator598569
 
17.3%
Uppercase Letter199523
 
5.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e457643
17.2%
n399046
15.0%
t295450
11.1%
i290117
10.9%
r256467
9.6%
s238506
9.0%
o210642
7.9%
u144232
 
5.4%
v144232
 
5.4%
p95927
 
3.6%
Other values (4)127120
 
4.8%
Uppercase Letter
ValueCountFrequency (%)
N145885
73.1%
B38983
 
19.5%
M12772
 
6.4%
F1883
 
0.9%
Space Separator
ValueCountFrequency (%)
598569
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2858905
82.7%
Common598569
 
17.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e457643
16.0%
n399046
14.0%
t295450
10.3%
i290117
10.1%
r256467
9.0%
s238506
8.3%
o210642
7.4%
N145885
 
5.1%
u144232
 
5.0%
v144232
 
5.0%
Other values (8)276685
9.7%
Common
ValueCountFrequency (%)
598569
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII3457474
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
598569
17.3%
e457643
13.2%
n399046
11.5%
t295450
8.5%
i290117
8.4%
r256467
7.4%
s238506
 
6.9%
o210642
 
6.1%
N145885
 
4.2%
u144232
 
4.2%
Other values (9)420917
12.2%

country_of_birth_father
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct        43
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     1.5 MiB

United-States        159163
Mexico               10008
?                    6713
Puerto-Rico          2680
Italy                2212
Other values (38)    18747

Length

Max length       29
Median length    14
Mean length      12.66875999
Min length       2

Characters and Unicode

Total characters       2527709
Distinct characters    47
Distinct categories    7
Distinct scripts       2
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row United-States
2nd row United-States
3rd row Vietnam
4th row United-States
5th row United-States

Common Values

Value                 Count     Frequency (%)
United-States         159163    79.8%
Mexico                10008     5.0%
?                     6713      3.4%
Puerto-Rico           2680      1.3%
Italy                 2212      1.1%
Canada                1380      0.7%
Germany               1356      0.7%
Dominican-Republic    1290      0.6%
Poland                1212      0.6%
Philippines           1154      0.6%
Other values (33)     12355     6.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
united-states159163
79.3%
mexico10008
 
5.0%
6713
 
3.3%
puerto-rico2680
 
1.3%
italy2212
 
1.1%
canada1380
 
0.7%
germany1356
 
0.7%
dominican-republic1290
 
0.6%
poland1212
 
0.6%
philippines1154
 
0.6%
Other values (39)13627
 
6.8%

Most occurring characters

ValueCountFrequency (%)
t485168
19.2%
e338573
13.4%
200795
7.9%
a185809
 
7.4%
i184161
 
7.3%
n173312
 
6.9%
d166069
 
6.6%
-164325
 
6.5%
S161240
 
6.4%
s160933
 
6.4%
Other values (37)307324
12.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1796607
71.1%
Uppercase Letter358838
 
14.2%
Space Separator200795
 
7.9%
Dash Punctuation164325
 
6.5%
Other Punctuation6826
 
0.3%
Open Punctuation159
 
< 0.1%
Close Punctuation159
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t485168
27.0%
e338573
18.8%
a185809
 
10.3%
i184161
 
10.3%
n173312
 
9.6%
d166069
 
9.2%
s160933
 
9.0%
o22790
 
1.3%
c17366
 
1.0%
l11412
 
0.6%
Other values (11)51014
 
2.8%
Uppercase Letter
ValueCountFrequency (%)
S161240
44.9%
U159481
44.4%
M10008
 
2.8%
P5794
 
1.6%
C4171
 
1.2%
R3970
 
1.1%
I3692
 
1.0%
G2304
 
0.6%
E2154
 
0.6%
D1290
 
0.4%
Other values (10)4734
 
1.3%
Other Punctuation
ValueCountFrequency (%)
?6713
98.3%
&113
 
1.7%
Space Separator
ValueCountFrequency (%)
200795
100.0%
Dash Punctuation
ValueCountFrequency (%)
-164325
100.0%
Open Punctuation
ValueCountFrequency (%)
(159
100.0%
Close Punctuation
ValueCountFrequency (%)
)159
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2155445
85.3%
Common372264
 
14.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
t485168
22.5%
e338573
15.7%
a185809
 
8.6%
i184161
 
8.5%
n173312
 
8.0%
d166069
 
7.7%
S161240
 
7.5%
s160933
 
7.5%
U159481
 
7.4%
o22790
 
1.1%
Other values (31)117909
 
5.5%
Common
ValueCountFrequency (%)
200795
53.9%
-164325
44.1%
?6713
 
1.8%
(159
 
< 0.1%
)159
 
< 0.1%
&113
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII2527709
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t485168
19.2%
e338573
13.4%
200795
7.9%
a185809
 
7.4%
i184161
 
7.3%
n173312
 
6.9%
d166069
 
6.6%
-164325
 
6.5%
S161240
 
6.4%
s160933
 
6.4%
Other values (37)307324
12.2%

country_of_birth_mother
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct        43
Distinct (%)    < 0.1%
Missing         0
Missing (%)     0.0%
Memory size     1.5 MiB

United-States        160479
Mexico               9781
?                    6119
Puerto-Rico          2473
Italy                1844
Other values (38)    18827

Length

Max length       29
Median length    14
Mean length      12.72127023
Min length       2

Characters and Unicode

Total characters       2538186
Distinct characters    47
Distinct categories    7
Distinct scripts       2
Distinct blocks        1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique        0
Unique (%)    0.0%

Sample

1st row United-States
2nd row United-States
3rd row Vietnam
4th row United-States
5th row United-States

Common Values

Value                Count     Frequency (%)
United-States        160479    80.4%
Mexico               9781      4.9%
?                    6119      3.1%
Puerto-Rico          2473      1.2%
Italy                1844      0.9%
Canada               1451      0.7%
Germany              1382      0.7%
Philippines          1231      0.6%
Poland               1110      0.6%
El-Salvador          1108      0.6%
Other values (33)    12545     6.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
united-states160479
79.9%
mexico9781
 
4.9%
6119
 
3.0%
puerto-rico2473
 
1.2%
italy1844
 
0.9%
canada1451
 
0.7%
germany1382
 
0.7%
philippines1231
 
0.6%
poland1110
 
0.6%
el-salvador1108
 
0.6%
Other values (39)13889
 
6.9%

Most occurring characters

ValueCountFrequency (%)
t488579
19.2%
e340658
13.4%
200867
7.9%
a187061
 
7.4%
i184556
 
7.3%
n174658
 
6.9%
d167641
 
6.6%
-165369
 
6.5%
S162751
 
6.4%
s162309
 
6.4%
Other values (37)303737
12.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter1804888
71.1%
Uppercase Letter360530
 
14.2%
Space Separator200867
 
7.9%
Dash Punctuation165369
 
6.5%
Other Punctuation6218
 
0.2%
Open Punctuation157
 
< 0.1%
Close Punctuation157
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t488579
27.1%
e340658
18.9%
a187061
 
10.4%
i184556
 
10.2%
n174658
 
9.7%
d167641
 
9.3%
s162309
 
9.0%
o22004
 
1.2%
c16460
 
0.9%
l11200
 
0.6%
Other values (11)49762
 
2.8%
Uppercase Letter
ValueCountFrequency (%)
S162751
45.1%
U160793
44.6%
M9781
 
2.7%
P5543
 
1.5%
C4088
 
1.1%
R3576
 
1.0%
I3379
 
0.9%
E2386
 
0.7%
G2244
 
0.6%
D1103
 
0.3%
Other values (10)4886
 
1.4%
Other Punctuation
ValueCountFrequency (%)
?6119
98.4%
&99
 
1.6%
Space Separator
ValueCountFrequency (%)
200867
100.0%
Dash Punctuation
ValueCountFrequency (%)
-165369
100.0%
Open Punctuation
ValueCountFrequency (%)
(157
100.0%
Close Punctuation
ValueCountFrequency (%)
)157
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin2165418
85.3%
Common372768
 
14.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
t488579
22.6%
e340658
15.7%
a187061
 
8.6%
i184556
 
8.5%
n174658
 
8.1%
d167641
 
7.7%
S162751
 
7.5%
s162309
 
7.5%
U160793
 
7.4%
o22004
 
1.0%
Other values (31)114408
 
5.3%
Common
ValueCountFrequency (%)
200867
53.9%
-165369
44.4%
?6119
 
1.6%
(157
 
< 0.1%
)157
 
< 0.1%
&99
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII2538186
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t488579
19.2%
e340658
13.4%
200867
7.9%
a187061
 
7.4%
i184556
 
7.3%
n174658
 
6.9%
d167641
 
6.6%
-165369
 
6.5%
S162751
 
6.4%
s162309
 
6.4%
Other values (37)303737
12.0%

country_of_birth_self
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct43
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size1.5 MiB
United-States
176989 
Mexico
 
5767
?
 
3393
Puerto-Rico
 
1400
Germany
 
851
Other values (38)
 
11123

Length

Max length29
Median length14
Mean length13.27975722
Min length2

Characters and Unicode

Total characters2649617
Distinct characters47
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row United-States
2nd row United-States
3rd row Vietnam
4th row United-States
5th row United-States

Common Values

Value | Count | Frequency (%)
United-States | 176989 | 88.7%
Mexico | 5767 | 2.9%
? | 3393 | 1.7%
Puerto-Rico | 1400 | 0.7%
Germany | 851 | 0.4%
Philippines | 845 | 0.4%
Cuba | 837 | 0.4%
Canada | 700 | 0.4%
Dominican-Republic | 690 | 0.3%
El-Salvador | 689 | 0.3%
Other values (33) | 7362 | 3.7%

Length

Histogram of lengths of the category
Value | Count | Frequency (%)
united-states | 176989 | 88.2%
mexico | 5767 | 2.9%
(blank) | 3393 | 1.7%
puerto-rico | 1400 | 0.7%
germany | 851 | 0.4%
philippines | 845 | 0.4%
cuba | 837 | 0.4%
canada | 700 | 0.3%
dominican-republic | 690 | 0.3%
el-salvador | 689 | 0.3%
Other values (39) | 8409 | 4.2%

Most occurring characters

Value | Count | Frequency (%)
t | 534730 | 20.2%
e | 365867 | 13.8%
(space) | 200570 | 7.6%
a | 192481 | 7.3%
i | 192126 | 7.3%
n | 185160 | 7.0%
d | 180622 | 6.8%
- | 179910 | 6.8%
S | 178462 | 6.7%
s | 178172 | 6.7%
Other values (37) | 261517 | 9.9%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 1888049 | 71.3%
Uppercase Letter | 377391 | 14.2%
Space Separator | 200570 | 7.6%
Dash Punctuation | 179910 | 6.8%
Other Punctuation | 3459 | 0.1%
Open Punctuation | 119 | < 0.1%
Close Punctuation | 119 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
t | 534730 | 28.3%
e | 365867 | 19.4%
a | 192481 | 10.2%
i | 192126 | 10.2%
n | 185160 | 9.8%
d | 180622 | 9.6%
s | 178172 | 9.4%
o | 12975 | 0.7%
c | 9805 | 0.5%
x | 5767 | 0.3%
Other values (11) | 30344 | 1.6%

Uppercase Letter
Value | Count | Frequency (%)
S | 178462 | 47.3%
U | 177227 | 47.0%
M | 5767 | 1.5%
P | 3096 | 0.8%
C | 2544 | 0.7%
R | 2090 | 0.6%
G | 1461 | 0.4%
E | 1404 | 0.4%
I | 1238 | 0.3%
D | 690 | 0.2%
Other values (10) | 3412 | 0.9%

Other Punctuation
Value | Count | Frequency (%)
? | 3393 | 98.1%
& | 66 | 1.9%

Space Separator
Value | Count | Frequency (%)
(space) | 200570 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 179910 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 119 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 119 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2265440 | 85.5%
Common | 384177 | 14.5%

Most frequent character per script

Latin
Value | Count | Frequency (%)
t | 534730 | 23.6%
e | 365867 | 16.1%
a | 192481 | 8.5%
i | 192126 | 8.5%
n | 185160 | 8.2%
d | 180622 | 8.0%
S | 178462 | 7.9%
s | 178172 | 7.9%
U | 177227 | 7.8%
o | 12975 | 0.6%
Other values (31) | 67618 | 3.0%

Common
Value | Count | Frequency (%)
(space) | 200570 | 52.2%
- | 179910 | 46.8%
? | 3393 | 0.9%
( | 119 | < 0.1%
) | 119 | < 0.1%
& | 66 | < 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 2649617 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
t | 534730 | 20.2%
e | 365867 | 13.8%
(space) | 200570 | 7.6%
a | 192481 | 7.3%
i | 192126 | 7.3%
n | 185160 | 7.0%
d | 180622 | 6.8%
- | 179910 | 6.8%
S | 178462 | 6.7%
s | 178172 | 6.7%
Other values (37) | 261517 | 9.9%

citizenship
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value | Count
Native- Born in the United States | 176992
Foreign born- Not a citizen of U S | 13401
Foreign born- U S citizen by naturalization | 5855
Native- Born abroad of American Parent(s) | 1756
Native- Born in Puerto Rico or U S Outlying | 1519

Length

Max length: 44
Median length: 34
Mean length: 34.57431975
Min length: 34

Characters and Unicode

Total characters: 6898372
Distinct characters: 33
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row Native- Born in the United States
2nd row Native- Born in the United States
3rd row Foreign born- Not a citizen of U S
4th row Native- Born in the United States
5th row Native- Born in the United States

Common Values

Value | Count | Frequency (%)
Native- Born in the United States | 176992 | 88.7%
Foreign born- Not a citizen of U S | 13401 | 6.7%
Foreign born- U S citizen by naturalization | 5855 | 2.9%
Native- Born abroad of American Parent(s) | 1756 | 0.9%
Native- Born in Puerto Rico or U S Outlying | 1519 | 0.8%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
born | 199523 | 16.2%
native | 180267 | 14.6%
in | 178511 | 14.5%
the | 176992 | 14.3%
united | 176992 | 14.3%
states | 176992 | 14.3%
s | 20775 | 1.7%
u | 20775 | 1.7%
citizen | 19256 | 1.6%
foreign | 19256 | 1.6%
Other values (12) | 65013 | 5.3%
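
The word counts above can be traced back to the Common Values table for citizenship. The Python sketch below rebuilds them by weighting each token with the number of rows that carry its value. The tokenization rule (lowercase, split on whitespace, strip surrounding punctuation) is an assumption chosen because it reproduces the counts shown; the report tool's actual rule is not stated in this text, and the helper name word_counts is hypothetical.

```python
import string
from collections import Counter

# Row counts copied from the Common Values table for `citizenship`.
value_counts = {
    "Native- Born in the United States": 176992,
    "Foreign born- Not a citizen of U S": 13401,
    "Foreign born- U S citizen by naturalization": 5855,
    "Native- Born abroad of American Parent(s)": 1756,
    "Native- Born in Puerto Rico or U S Outlying": 1519,
}


def word_counts(counts_by_value):
    """Weight each lowercased, punctuation-stripped token by its row count."""
    totals = Counter()
    for value, n_rows in counts_by_value.items():
        for token in value.lower().split():
            token = token.strip(string.punctuation)  # assumed tokenization
            if token:
                totals[token] += n_rows
    return totals


words = word_counts(value_counts)
print(words.most_common(3))
```

For example, "native" occurs in three of the five values, so its count is 176992 + 1756 + 1519 = 180267, which matches the table.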

Most occurring characters

Value | Count | Frequency (%)
(space) | 1247753 | 18.1%
t | 937396 | 13.6%
e | 754786 | 10.9%
n | 610279 | 8.8%
i | 610042 | 8.8%
a | 395249 | 5.7%
o | 259505 | 3.8%
r | 232940 | 3.4%
- | 199523 | 2.9%
U | 197767 | 2.9%
Other values (23) | 1453132 | 21.1%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 4650790 | 67.4%
Space Separator | 1247753 | 18.1%
Uppercase Letter | 796794 | 11.6%
Dash Punctuation | 199523 | 2.9%
Open Punctuation | 1756 | < 0.1%
Close Punctuation | 1756 | < 0.1%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
t | 937396 | 20.2%
e | 754786 | 16.2%
n | 610279 | 13.1%
i | 610042 | 13.1%
a | 395249 | 8.5%
o | 259505 | 5.6%
r | 232940 | 5.0%
v | 180267 | 3.9%
d | 178748 | 3.8%
s | 178748 | 3.8%
Other values (10) | 312830 | 6.7%

Uppercase Letter
Value | Count | Frequency (%)
U | 197767 | 24.8%
S | 197767 | 24.8%
N | 193668 | 24.3%
B | 180267 | 22.6%
F | 19256 | 2.4%
P | 3275 | 0.4%
A | 1756 | 0.2%
R | 1519 | 0.2%
O | 1519 | 0.2%

Space Separator
Value | Count | Frequency (%)
(space) | 1247753 | 100.0%

Dash Punctuation
Value | Count | Frequency (%)
- | 199523 | 100.0%

Open Punctuation
Value | Count | Frequency (%)
( | 1756 | 100.0%

Close Punctuation
Value | Count | Frequency (%)
) | 1756 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 5447584 | 79.0%
Common | 1450788 | 21.0%

Most frequent character per script

Latin
Value | Count | Frequency (%)
t | 937396 | 17.2%
e | 754786 | 13.9%
n | 610279 | 11.2%
i | 610042 | 11.2%
a | 395249 | 7.3%
o | 259505 | 4.8%
r | 232940 | 4.3%
U | 197767 | 3.6%
S | 197767 | 3.6%
N | 193668 | 3.6%
Other values (19) | 1058185 | 19.4%

Common
Value | Count | Frequency (%)
(space) | 1247753 | 86.0%
- | 199523 | 13.8%
( | 1756 | 0.1%
) | 1756 | 0.1%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 6898372 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 1247753 | 18.1%
t | 937396 | 13.6%
e | 754786 | 10.9%
n | 610279 | 8.8%
i | 610042 | 8.8%
a | 395249 | 5.7%
o | 259505 | 3.8%
r | 232940 | 3.4%
- | 199523 | 2.9%
U | 197767 | 2.9%
Other values (23) | 1453132 | 21.1%

own_business_or_self_employed
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value | Count
0 | 180672
2 | 16153
1 | 2698

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 399046
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Common Values

Value | Count | Frequency (%)
0 | 180672 | 90.6%
2 | 16153 | 8.1%
1 | 2698 | 1.4%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
0 | 180672 | 90.6%
2 | 16153 | 8.1%
1 | 2698 | 1.4%

Most occurring characters

Value | Count | Frequency (%)
(space) | 199523 | 50.0%
0 | 180672 | 45.3%
2 | 16153 | 4.0%
1 | 2698 | 0.7%

Most occurring categories

Value | Count | Frequency (%)
Space Separator | 199523 | 50.0%
Decimal Number | 199523 | 50.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 180672 | 90.6%
2 | 16153 | 8.1%
1 | 2698 | 1.4%

Space Separator
Value | Count | Frequency (%)
(space) | 199523 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 399046 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
(space) | 199523 | 50.0%
0 | 180672 | 45.3%
2 | 16153 | 4.0%
1 | 2698 | 0.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 399046 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 199523 | 50.0%
0 | 180672 | 45.3%
2 | 16153 | 4.0%
1 | 2698 | 0.7%

fill_inc_questionnaire_for_veterans_admin
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value | Count
Not in universe | 197539
No | 1593
Yes | 391

Length

Max length: 16
Median length: 16
Mean length: 15.87269137
Min length: 3

Characters and Unicode

Total characters: 3166967
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row Not in universe
2nd row Not in universe
3rd row Not in universe
4th row Not in universe
5th row Not in universe

Common Values

Value | Count | Frequency (%)
Not in universe | 197539 | 99.0%
No | 1593 | 0.8%
Yes | 391 | 0.2%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
not | 197539 | 33.2%
in | 197539 | 33.2%
universe | 197539 | 33.2%
no | 1593 | 0.3%
yes | 391 | 0.1%

Most occurring characters

Value | Count | Frequency (%)
(space) | 594601 | 18.8%
e | 395469 | 12.5%
i | 395078 | 12.5%
n | 395078 | 12.5%
N | 199132 | 6.3%
o | 199132 | 6.3%
s | 197930 | 6.2%
t | 197539 | 6.2%
u | 197539 | 6.2%
v | 197539 | 6.2%
Other values (2) | 197930 | 6.2%

Most occurring categories

Value | Count | Frequency (%)
Lowercase Letter | 2372843 | 74.9%
Space Separator | 594601 | 18.8%
Uppercase Letter | 199523 | 6.3%

Most frequent character per category

Lowercase Letter
Value | Count | Frequency (%)
e | 395469 | 16.7%
i | 395078 | 16.6%
n | 395078 | 16.6%
o | 199132 | 8.4%
s | 197930 | 8.3%
t | 197539 | 8.3%
u | 197539 | 8.3%
v | 197539 | 8.3%
r | 197539 | 8.3%

Uppercase Letter
Value | Count | Frequency (%)
N | 199132 | 99.8%
Y | 391 | 0.2%

Space Separator
Value | Count | Frequency (%)
(space) | 594601 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Latin | 2572366 | 81.2%
Common | 594601 | 18.8%

Most frequent character per script

Latin
Value | Count | Frequency (%)
e | 395469 | 15.4%
i | 395078 | 15.4%
n | 395078 | 15.4%
N | 199132 | 7.7%
o | 199132 | 7.7%
s | 197930 | 7.7%
t | 197539 | 7.7%
u | 197539 | 7.7%
v | 197539 | 7.7%
r | 197539 | 7.7%

Common
Value | Count | Frequency (%)
(space) | 594601 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 3166967 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 594601 | 18.8%
e | 395469 | 12.5%
i | 395078 | 12.5%
n | 395078 | 12.5%
N | 199132 | 6.3%
o | 199132 | 6.3%
s | 197930 | 6.2%
t | 197539 | 6.2%
u | 197539 | 6.2%
v | 197539 | 6.2%
Other values (2) | 197930 | 6.2%

veterans_benefits
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value | Count
2 | 150130
0 | 47409
1 | 1984

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 399046
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row 2
2nd row 2
3rd row 2
4th row 0
5th row 0

Common Values

Value | Count | Frequency (%)
2 | 150130 | 75.2%
0 | 47409 | 23.8%
1 | 1984 | 1.0%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
2 | 150130 | 75.2%
0 | 47409 | 23.8%
1 | 1984 | 1.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 199523 | 50.0%
2 | 150130 | 37.6%
0 | 47409 | 11.9%
1 | 1984 | 0.5%

Most occurring categories

Value | Count | Frequency (%)
Space Separator | 199523 | 50.0%
Decimal Number | 199523 | 50.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
2 | 150130 | 75.2%
0 | 47409 | 23.8%
1 | 1984 | 1.0%

Space Separator
Value | Count | Frequency (%)
(space) | 199523 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 399046 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
(space) | 199523 | 50.0%
2 | 150130 | 37.6%
0 | 47409 | 11.9%
1 | 1984 | 0.5%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 399046 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 199523 | 50.0%
2 | 150130 | 37.6%
0 | 47409 | 11.9%
1 | 1984 | 0.5%

weeks_worked_in_year
Real number (ℝ≥0)

HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
HIGH CORRELATION
ZEROS

Distinct: 53
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 23.17489713
Minimum: 0
Maximum: 52
Zeros: 95983
Zeros (%): 48.1%
Negative: 0
Negative (%): 0.0%
Memory size: 1.5 MiB

Quantile statistics

Minimum: 0
5th percentile: 0
Q1: 0
Median: 8
Q3: 52
95th percentile: 52
Maximum: 52
Range: 52
Interquartile range (IQR): 52

Descriptive statistics

Standard deviation: 24.41148817
Coefficient of variation (CV): 1.053359073
Kurtosis: -1.863805826
Mean: 23.17489713
Median Absolute Deviation (MAD): 8
Skewness: 0.2101693419
Sum: 4623925
Variance: 595.9207546
Monotonicity: Not monotonic
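
Several of the statistics above are simple functions of one another. As a sanity check, the short Python sketch below recomputes the mean, variance, coefficient of variation, interquartile range and share of zeros from the figures reported for weeks_worked_in_year. It only re-derives numbers already shown in the report; the variable names are chosen for this example.

```python
# Figures copied from the report for `weeks_worked_in_year`.
n_rows = 199523          # Number of observations
total = 4623925          # Sum
std = 24.41148817        # Standard deviation
zeros = 95983            # Rows with value 0
q1, q3 = 0, 52           # First and third quartiles

mean = total / n_rows                # arithmetic mean = Sum / n
variance = std ** 2                  # variance = squared standard deviation
cv = std / mean                      # coefficient of variation = std / mean
iqr = q3 - q1                        # interquartile range = Q3 - Q1
zero_share = 100 * zeros / n_rows    # percentage of zero values

print(f"mean={mean:.8f}")
print(f"variance={variance:.4f}")
print(f"cv={cv:.6f}")
print(f"iqr={iqr}")
print(f"zeros={zero_share:.1f}%")
```

A CV slightly above 1 together with Q1 = 0 and Q3 = 52 reflects the two large spikes in the data, at 0 weeks and at 52 weeks.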
Histogram with fixed size bins (bins=50)
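Most frequent values
Value | Count | Frequency (%)
0 | 95983 | 48.1%
52 | 70314 | 35.2%
40 | 2790 | 1.4%
50 | 2304 | 1.2%
26 | 2268 | 1.1%
48 | 1806 | 0.9%
12 | 1780 | 0.9%
30 | 1378 | 0.7%
20 | 1330 | 0.7%
8 | 1126 | 0.6%
Other values (43) | 18444 | 9.2%

Smallest 10 values
Value | Count | Frequency (%)
0 | 95983 | 48.1%
1 | 464 | 0.2%
2 | 458 | 0.2%
3 | 417 | 0.2%
4 | 757 | 0.4%
5 | 309 | 0.2%
6 | 646 | 0.3%
7 | 152 | 0.1%
8 | 1126 | 0.6%
9 | 239 | 0.1%

Largest 10 values
Value | Count | Frequency (%)
52 | 70314 | 35.2%
51 | 819 | 0.4%
50 | 2304 | 1.2%
49 | 509 | 0.3%
48 | 1806 | 0.9%
47 | 278 | 0.1%
46 | 708 | 0.4%
45 | 669 | 0.3%
44 | 845 | 0.4%
43 | 374 | 0.2%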

year
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value | Count
94 | 99827
95 | 99696

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 598569
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row 95
2nd row 94
3rd row 95
4th row 94
5th row 94

Common Values

Value | Count | Frequency (%)
94 | 99827 | 50.0%
95 | 99696 | 50.0%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
94 | 99827 | 50.0%
95 | 99696 | 50.0%

Most occurring characters

Value | Count | Frequency (%)
(space) | 199523 | 33.3%
9 | 199523 | 33.3%
4 | 99827 | 16.7%
5 | 99696 | 16.7%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 399046 | 66.7%
Space Separator | 199523 | 33.3%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
9 | 199523 | 50.0%
4 | 99827 | 25.0%
5 | 99696 | 25.0%

Space Separator
Value | Count | Frequency (%)
(space) | 199523 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
Common | 598569 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
(space) | 199523 | 33.3%
9 | 199523 | 33.3%
4 | 99827 | 16.7%
5 | 99696 | 16.7%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 598569 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
(space) | 199523 | 33.3%
9 | 199523 | 33.3%
4 | 99827 | 16.7%
5 | 99696 | 16.7%

label
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 1.5 MiB

Value | Count
0 | 187141
1 | 12382

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 199523
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row 0
2nd row 0
3rd row 0
4th row 0
5th row 0

Common Values

Value | Count | Frequency (%)
0 | 187141 | 93.8%
1 | 12382 | 6.2%

Length

Histogram of lengths of the category

Pie chart

Value | Count | Frequency (%)
0 | 187141 | 93.8%
1 | 12382 | 6.2%

Most occurring characters

Value | Count | Frequency (%)
0 | 187141 | 93.8%
1 | 12382 | 6.2%

Most occurring categories

Value | Count | Frequency (%)
Decimal Number | 199523 | 100.0%

Most frequent character per category

Decimal Number
Value | Count | Frequency (%)
0 | 187141 | 93.8%
1 | 12382 | 6.2%

Most occurring scripts

Value | Count | Frequency (%)
Common | 199523 | 100.0%

Most frequent character per script

Common
Value | Count | Frequency (%)
0 | 187141 | 93.8%
1 | 12382 | 6.2%

Most occurring blocks

Value | Count | Frequency (%)
ASCII | 199523 | 100.0%

Most frequent character per block

ASCII
Value | Count | Frequency (%)
0 | 187141 | 93.8%
1 | 12382 | 6.2%

Interactions

[Interaction plots: 100 figures, not reproduced here]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that for a linear relationship the steepness of the line (its angle to the x-axis) does not affect the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
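
The calculation just described can be sketched in a few lines of plain Python. This is a minimal illustration on hypothetical toy data, not the code the report generator runs; the function name pearson_r is chosen for this example. The second call shows the location and scale invariance mentioned above.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    # The 1/n (or 1/(n-1)) factors cancel between numerator and denominator.
    return cov / math.sqrt(var_x * var_y)


# Hypothetical toy data, not taken from the dataset.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 1.0, 4.0, 3.0, 5.0]

r = pearson_r(x, y)
# Shifting x and rescaling it by a positive factor leaves r unchanged.
r_shifted = pearson_r([10 * v + 3 for v in x], y)
print(r, r_shifted)
```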

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
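
A minimal Python sketch of this rank-based calculation follows. It assumes that tied values receive the average of their rank positions, which is a common convention but is not stated in the report. The helper names average_ranks and spearman_rho are chosen for this example, and the data are hypothetical.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical data: y = x**3 is monotonic but not linear in x.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]
print(spearman_rho(x, y), pearson_r(x, y))
```

On this example ρ reaches its maximum because the relationship is perfectly monotonic, while Pearson's r stays below 1 because it is not linear.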

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
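
The pair-counting definition can be written directly in Python. The sketch below implements exactly the formula as worded here (often called tau-a), where tied pairs count as neither concordant nor discordant. Statistical libraries frequently use a tie-adjusted variant instead, so results can differ on data with many ties. The function name and the data are hypothetical.

```python
from itertools import combinations


def kendall_tau(xs, ys):
    """(concordant - discordant) / total number of pairs; ties count as neither."""
    concordant = discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / len(pairs)


# Hypothetical toy data: 10 pairs, of which 2 are discordant.
x = [1, 2, 3, 4, 5]
y = [2, 1, 4, 3, 5]
tau = kendall_tau(x, y)
print(tau)
```

This direct version compares every pair and is therefore quadratic in the number of observations, which is fine for illustration but slow on a dataset of this size.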

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
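
For orientation, the bias-corrected estimator can be sketched in plain Python as below. This follows the commonly cited form of Bergsma's correction and is an illustration under that assumption, not the report generator's actual implementation. The function name cramers_v_corrected and the toy data are hypothetical, and returning 0.0 for degenerate inputs is a convention chosen for this sketch.

```python
import math
from collections import Counter


def cramers_v_corrected(xs, ys):
    """Bias-corrected Cramér's V for two equally long nominal sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    r, k = len(row_totals), len(col_totals)
    if n < 2 or r < 2 or k < 2:
        return 0.0  # sketch convention: a constant variable carries no association

    # Pearson chi-squared statistic over the full r x k contingency table.
    chi2 = 0.0
    for a, row_n in row_totals.items():
        for b, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (joint.get((a, b), 0) - expected) ** 2 / expected

    phi2 = chi2 / n
    # Bias correction of phi^2 and of the effective table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # sketch convention for degenerate tables
    return math.sqrt(phi2_corr / denom)


# Hypothetical toy data: perfect association and exact independence.
perfect = cramers_v_corrected(["a", "b"] * 50, ["u", "v"] * 50)
independent = cramers_v_corrected(["a", "a", "b", "b"] * 25, ["u", "v", "u", "v"] * 25)
print(perfect, independent)
```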

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

(index) | age | class_of_worker | industry_code | occupation_code | education | wage_per_hour | enrolled_in_edu_inst_last_wk | marital_status | major_industry_code | major_occupation_code | race | hispanic_Origin | sex | member_of_a_labor_union | reason_for_unemployment | full_or_part_time_employment_stat | capital_gains | capital_losses | divdends_from_stocks | tax_filer_status | region_of_previous_residence | state_of_previous_residence | detailed_household_and_family_stat | detailed_household_summary_in_household | instance_weight | migration_code_change_in_msa | migration_code_change_in_reg | migration_code_move_within_reg | live_in_this_house_1_year_ago | migration_prev_res_in_sunbelt | num_persons_worked_for_employer | family_members_under_18 | country_of_birth_father | country_of_birth_mother | country_of_birth_self | citizenship | own_business_or_self_employed | fill_inc_questionnaire_for_veterans_admin | veterans_benefits | weeks_worked_in_year | year | label
0 | 73 | Not in universe | 0 | 0 | High school graduate | 0 | Not in universe | Widowed | Not in universe or children | Not in universe | White | All other | Female | Not in universe | Not in universe | Not in labor force | 0 | 0 | 0 | Nonfiler | Not in universe | Not in universe | Other Rel 18+ ever marr not in subfamily | Other relative of householder | 1700.09 | ? | ? | ? | Not in universe under 1 year old | ? | 0 | Not in universe | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 2 | 0 | 95 | 0
1 | 58 | Self-employed-not incorporated | 4 | 34 | Some college but no degree | 0 | Not in universe | Divorced | Construction | Precision production craft & repair | White | All other | Male | Not in universe | Not in universe | Children or Armed Forces | 0 | 0 | 0 | Head of household | South | Arkansas | Householder | Householder | 1053.55 | MSA to MSA | Same county | Same county | No | Yes | 1 | Not in universe | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 2 | 52 | 94 | 0
2 | 18 | Not in universe | 0 | 0 | 10th grade | 0 | High school | Never married | Not in universe or children | Not in universe | Asian or Pacific Islander | All other | Female | Not in universe | Not in universe | Not in labor force | 0 | 0 | 0 | Nonfiler | Not in universe | Not in universe | Child 18+ never marr Not in a subfamily | Child 18 or older | 991.95 | ? | ? | ? | Not in universe under 1 year old | ? | 0 | Not in universe | Vietnam | Vietnam | Vietnam | Foreign born- Not a citizen of U S | 0 | Not in universe | 2 | 0 | 95 | 0
3 | 9 | Not in universe | 0 | 0 | Children | 0 | Not in universe | Never married | Not in universe or children | Not in universe | White | All other | Female | Not in universe | Not in universe | Children or Armed Forces | 0 | 0 | 0 | Nonfiler | Not in universe | Not in universe | Child <18 never marr not in subfamily | Child under 18 never married | 1758.14 | Nonmover | Nonmover | Nonmover | Yes | Not in universe | 0 | Both parents present | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 0 | 0 | 94 | 0
4 | 10 | Not in universe | 0 | 0 | Children | 0 | Not in universe | Never married | Not in universe or children | Not in universe | White | All other | Female | Not in universe | Not in universe | Children or Armed Forces | 0 | 0 | 0 | Nonfiler | Not in universe | Not in universe | Child <18 never marr not in subfamily | Child under 18 never married | 1069.16 | Nonmover | Nonmover | Nonmover | Yes | Not in universe | 0 | Both parents present | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 0 | 0 | 94 | 0
5 | 48 | Private | 40 | 10 | Some college but no degree | 1200 | Not in universe | Married-civilian spouse present | Entertainment | Professional specialty | Amer Indian Aleut or Eskimo | All other | Female | No | Not in universe | Full-time schedules | 0 | 0 | 0 | Joint both under 65 | Not in universe | Not in universe | Spouse of householder | Spouse of householder | 162.61 | ? | ? | ? | Not in universe under 1 year old | ? | 1 | Not in universe | Philippines | United-States | United-States | Native- Born in the United States | 2 | Not in universe | 2 | 52 | 95 | 0
6 | 42 | Private | 34 | 3 | Bachelors degree(BA AB BS) | 0 | Not in universe | Married-civilian spouse present | Finance insurance and real estate | Executive admin and managerial | White | All other | Male | Not in universe | Not in universe | Children or Armed Forces | 5178 | 0 | 0 | Joint both under 65 | Not in universe | Not in universe | Householder | Householder | 1535.86 | Nonmover | Nonmover | Nonmover | Yes | Not in universe | 6 | Not in universe | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 2 | 52 | 94 | 0
7 | 28 | Private | 4 | 40 | High school graduate | 0 | Not in universe | Never married | Construction | Handlers equip cleaners etc | White | All other | Female | Not in universe | Job loser - on layoff | Unemployed full-time | 0 | 0 | 0 | Single | Not in universe | Not in universe | Secondary individual | Nonrelative of householder | 898.83 | ? | ? | ? | Not in universe under 1 year old | ? | 4 | Not in universe | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 2 | 30 | 95 | 0
8 | 47 | Local government | 43 | 26 | Some college but no degree | 876 | Not in universe | Married-civilian spouse present | Education | Adm support including clerical | White | All other | Female | No | Not in universe | Full-time schedules | 0 | 0 | 0 | Joint both under 65 | Not in universe | Not in universe | Spouse of householder | Spouse of householder | 1661.53 | ? | ? | ? | Not in universe under 1 year old | ? | 5 | Not in universe | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 2 | 52 | 95 | 0
9 | 34 | Private | 4 | 37 | Some college but no degree | 0 | Not in universe | Married-civilian spouse present | Construction | Machine operators assmblrs & inspctrs | White | All other | Male | Not in universe | Not in universe | Children or Armed Forces | 0 | 0 | 0 | Joint both under 65 | Not in universe | Not in universe | Householder | Householder | 1146.79 | Nonmover | Nonmover | Nonmover | Yes | Not in universe | 6 | Not in universe | United-States | United-States | United-States | Native- Born in the United States | 0 | Not in universe | 2 | 52 | 94 | 0

Last rows

age, class_of_worker, industry_code, occupation_code, education, wage_per_hour, enrolled_in_edu_inst_last_wk, marital_status, major_industry_code, major_occupation_code, race, hispanic_Origin, sex, member_of_a_labor_union, reason_for_unemployment, full_or_part_time_employment_stat, capital_gains, capital_losses, divdends_from_stocks, tax_filer_status, region_of_previous_residence, state_of_previous_residence, detailed_household_and_family_stat, detailed_household_summary_in_household, instance_weight, migration_code_change_in_msa, migration_code_change_in_reg, migration_code_move_within_reg, live_in_this_house_1_year_ago, migration_prev_res_in_sunbelt, num_persons_worked_for_employer, family_members_under_18, country_of_birth_father, country_of_birth_mother, country_of_birth_self, citizenship, own_business_or_self_employed, fill_inc_questionnaire_for_veterans_admin, veterans_benefits, weeks_worked_in_year, year, label
19951357Private9379th grade0Not in universeDivorcedManufacturing-durable goodsMachine operators assmblrs & inspctrsWhiteCentral or South AmericanFemaleNot in universeNot in universeFull-time schedules000SingleNot in universeNot in universeHouseholderHouseholder743.66???Not in universe under 1 year old?4Not in universeDominican-RepublicDominican-RepublicDominican-RepublicForeign born- Not a citizen of U S0Not in universe252950
19951451Private331910th grade0Not in universeWidowedRetail tradeSalesWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000SingleSouthNorth DakotaHouseholderHouseholder1302.34NonMSA to nonMSASame countySame countyNoYes6Not in universeUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe252940
19951587Not in universe00High school graduate0Not in universeWidowedNot in universe or childrenNot in universeWhiteAll otherFemaleNot in universeNot in universeNot in labor force000SingleNot in universeNot in universeNonfamily householderHouseholder3255.80???Not in universe under 1 year old?0Not in universe?United-StatesUnited-StatesNative- Born in the United States0Not in universe20950
1995163Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeBlackAll otherMaleNot in universeNot in universeChildren or Armed Forces000NonfilerSouthUtahChild under 18 of RP of unrel subfamilyNonrelative of householder2733.75MSA to MSASame countySame countyNoYes0Mother only presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe00940
19951739Private4326Bachelors degree(BA AB BS)0Not in universeNever marriedEducationAdm support including clericalOtherMexican-AmericanMaleNoNot in universeFull-time schedules684900SingleNot in universeNot in universeNonfamily householderHouseholder908.14???Not in universe under 1 year old?6Not in universeMexicoMexicoMexicoForeign born- Not a citizen of U S2Not in universe252950
19951887Not in universe007th and 8th grade0Not in universeMarried-civilian spouse presentNot in universe or childrenNot in universeWhiteAll otherMaleNot in universeNot in universeNot in labor force000Joint both 65+Not in universeNot in universeHouseholderHouseholder955.27???Not in universe under 1 year old?0Not in universeCanadaUnited-StatesUnited-StatesNative- Born in the United States0Not in universe20950
19951965Self-employed-incorporated37211th grade0Not in universeMarried-civilian spouse presentBusiness and repair servicesExecutive admin and managerialWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces641809Joint one under 65 & one 65+Not in universeNot in universeHouseholderHouseholder687.19NonmoverNonmoverNonmoverYesNot in universe1Not in universeUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe252940
19952047Not in universe00Some college but no degree0Not in universeMarried-civilian spouse presentNot in universe or childrenNot in universeWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces00157Joint both under 65Not in universeNot in universeHouseholderHouseholder1923.03???Not in universe under 1 year old?6Not in universePolandPolandGermanyForeign born- U S citizen by naturalization0Not in universe252950
19952116Not in universe0010th grade0High schoolNever marriedNot in universe or childrenNot in universeWhiteAll otherFemaleNot in universeNot in universeNot in labor force000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married4664.87???Not in universe under 1 year old?0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe20950
19952232Private4230High school graduate0Not in universeNever marriedMedical except hospitalOther serviceBlackAll otherFemaleNoNot in universeChildren or Armed Forces000SingleNot in universeNot in universeNonfamily householderHouseholder1830.11NonmoverNonmoverNonmoverYesNot in universe6Not in universe???Foreign born- Not a citizen of U S0Not in universe252940

Duplicate rows

Most frequently occurring

age, class_of_worker, industry_code, occupation_code, education, wage_per_hour, enrolled_in_edu_inst_last_wk, marital_status, major_industry_code, major_occupation_code, race, hispanic_Origin, sex, member_of_a_labor_union, reason_for_unemployment, full_or_part_time_employment_stat, capital_gains, capital_losses, divdends_from_stocks, tax_filer_status, region_of_previous_residence, state_of_previous_residence, detailed_household_and_family_stat, detailed_household_summary_in_household, instance_weight, migration_code_change_in_msa, migration_code_change_in_reg, migration_code_move_within_reg, live_in_this_house_1_year_ago, migration_prev_res_in_sunbelt, num_persons_worked_for_employer, family_members_under_18, country_of_birth_father, country_of_birth_mother, country_of_birth_self, citizenship, own_business_or_self_employed, fill_inc_questionnaire_for_veterans_admin, veterans_benefits, weeks_worked_in_year, year, label, # duplicates
5593Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married2125.99???Not in universe under 1 year old?0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe009506
194711Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married1131.62NonmoverNonmoverNonmoverYesNot in universe0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe009406
1040Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married1363.88???Not in universe under 1 year old?0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe009505
3582Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married1182.42???Not in universe under 1 year old?0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe009505
5903Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married966.31???Not in universe under 1 year old?0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe009505
6033Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married1220.24???Not in universe under 1 year old?0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe009505
6273Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherMaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married1803.03NonmoverNonmoverNonmoverYesNot in universe0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe009405
8815Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married886.02???Not in universe under 1 year old?0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe009505
14338Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married1215.87???Not in universe under 1 year old?0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe009505
14538Not in universe00Children0Not in universeNever marriedNot in universe or childrenNot in universeWhiteAll otherFemaleNot in universeNot in universeChildren or Armed Forces000NonfilerNot in universeNot in universeChild <18 never marr not in subfamilyChild under 18 never married1979.97???Not in universe under 1 year old?0Both parents presentUnited-StatesUnited-StatesUnited-StatesNative- Born in the United States0Not in universe009505
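
A "# duplicates" count like the one in the table above can be approximated by counting how often each exact row occurs. The following is a minimal sketch using only the Python standard library. It is not necessarily how the report itself computes the figure, and the function name `most_frequent_duplicates` and the three-column toy rows are hypothetical stand-ins for the full 42-column records.

```python
from collections import Counter


def most_frequent_duplicates(rows, top=10):
    """Return (row, count) pairs for rows that occur more than once,
    ordered from most to least frequent."""
    # Tuples are hashable, so each full row can serve as a Counter key.
    counts = Counter(tuple(row) for row in rows)
    # Keep only rows that appear at least twice.
    repeated = [(row, n) for row, n in counts.items() if n > 1]
    # Most frequent first, mirroring the "Most frequently occurring" view.
    repeated.sort(key=lambda pair: pair[1], reverse=True)
    return repeated[:top]


# Hypothetical toy data: (age, sex, year) instead of all 42 columns.
toy_rows = [
    (3, "Female", 95),
    (3, "Female", 95),
    (3, "Female", 95),
    (11, "Female", 94),
    (11, "Female", 94),
    (40, "Male", 95),
]

top_duplicates = most_frequent_duplicates(toy_rows)
# Rows that could be dropped while keeping one copy of each record.
extra_copies = sum(n - 1 for _, n in top_duplicates)
```

Whether exact duplicates should be dropped depends on the data: survey records with few distinguishing fields (for example young children with identical household attributes) may legitimately coincide.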